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Abstract 

A  fundamental  requirement  in  realistic  ocean  simulations  and  dynamical  studies 
is  the  optimal  estimation  of  gridded  fields  from  the  spatially  irregular  and  multi¬ 
variate  data  sets  that  are  collected  by  varied  platforms.  In  this  work,  we  derive 
and  utilize  new  schemes  for  the  mapping  and  dynamical  inference  of  ocean  fields 
in  complex  multiply-connected  domains  and  study  the  computational  properties 
of  these  schemes.  Specifically,  we  extend  a  multiscale  Objective  Analysis  (OA) 
approach  to  complex  coastal  regions  and  archipelagos.  Bayesian-based  OAs  using 
covariances  as  inputs  commonly  require  an  estimate  of  the  distances  between  data 
and  model  points,  without  going  across  complex  landforms.  New  OA  schemes 
based  on  estimating  the  length  of  shortest  sea  paths  using  the  Level  Set  Method 
(LSM)  and  Fast  Marching  Method  (FMM)  are  thus  derived,  implemented  and  uti¬ 
lized  in  idealized  and  realistic  ocean  cases.  An  FMM-based  methodology  for  the 
estimation  of  total  velocity  under  geostrophic  balance  in  complex  domains  is  also 
presented.  Comparisons  with  other  OA  approaches  are  provided,  including  those 
using  stochastically  forced  partial  differential  equations  (SPDEs).  We  find  that 
the  FMM-based  OA  scheme  is  the  most  efficient  and  accurate.  We  also  show  that 
the  FMM-based  field  maps  do  not  require  postprocessing  (smoothing).  Mathemat¬ 
ical  and  computational  properties  of  our  new  OA  schemes  are  studied  in  detail, 
using  fundamental  theorems  and  illustrations.  We  find  that  higher-order  FMM’s 
schemes  improve  accuracy  and  that  a  multi-order  scheme  is  efficient.  We  also  pro¬ 
vide  solutions  that  ensure  the  use  of  positive-definite  covariances,  even  in  complex 


*We  are  grateful  to  the  Office  of  Naval  Research  for  research  support  under  grants 
N00014-07- 1-1061  and  N00014-08-1-1097  (ONR6.1),  N00014-07-1-0473  (PhilEx),  S05-06 
and  N00014-08-1-0680  (PLUS)  to  the  Massachusetts  Institute  of  Technology. 


Preprint  submitted  to  Ocean  Modelling 


April  9,  2011 


Report  Documentation  Page 


Form  Approved 
OMB  No.  0704-0188 


Public  reporting  burden  for  the  collection  of  information  is  estimated  to  average  1  hour  per  response,  including  the  time  for  reviewing  instructions,  searching  existing  data  sources,  gathering  and 
maintaining  the  data  needed,  and  completing  and  reviewing  the  collection  of  information.  Send  comments  regarding  this  burden  estimate  or  any  other  aspect  of  this  collection  of  information, 
including  suggestions  for  reducing  this  burden,  to  Washington  Headquarters  Services,  Directorate  for  Information  Operations  and  Reports,  1215  Jefferson  Davis  Highway,  Suite  1204,  Arlington 
VA  22202-4302.  Respondents  should  be  aware  that  notwithstanding  any  other  provision  of  law,  no  person  shall  be  subject  to  a  penalty  for  failing  to  comply  with  a  collection  of  information  if  it 
does  not  display  a  currently  valid  OMB  control  number. 


1.  REPORT  DATE 

09  APR  2011 


2.  REPORT  TYPE 


4.  TITLE  AND  SUBTITLE 

Statistical  Field  Estimation  for  Complex  Coastal  Regions  and 
Archipelagos 

6.  AUTHOR(S) 


7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

Massachusetts  Institute  of  Technology, Cambridge, MA, 02139 

9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 


3.  DATES  COVERED 

00-00-2011  to  00-00-2011 

5a.  CONTRACT  NUMBER 

5b.  GRANT  NUMBER 

5c.  PROGRAM  ELEMENT  NUMBER 

5d.  PROIECT  NUMBER 

5e.  TASK  NUMBER 

5f.  WORK  UNIT  NUMBER 

8.  PERFORMING  ORGANIZATION 
REPORT  NUMBER 


10.  SPONSOR/MONITOR'S  ACRONYM(S) 

11.  SPONSOR/MONITOR'S  REPORT 
NUMBER(S) 


12.  DISTRIBUTION/AVAILABILITY  STATEMENT 

Approved  for  public  release;  distribution  unlimited 

13.  SUPPLEMENTARY  NOTES 

14.  ABSTRACT 

A  fundamental  requirement  in  realistic  ocean  simulations  and  dynamical  studies  is  the  optimal  estimation 
of  gridded  fields  from  the  spatially  irregular  and  multivariate  data  sets  that  are  collected  by  varied 
platforms.  In  this  work,  we  derive  and  utilize  new  schemes  for  the  mapping  and  dynamical  inference  of 
ocean  fields  in  complex  multiply-connected  domains  and  study  the  computational  properties  of  these 
schemes.  Specifically,  we  extend  a  multiscale  Objective  Analysis  (OA)  approach  to  complex  coastal  regions 
and  archipelagos.  Bayesian-based  OAs  using  covariances  as  inputs  commonly  require  an  estimate  of  the 
distances  between  data  and  model  points,  without  going  across  complex  landforms.  New  OA  schemes  based 
on  estimating  the  length  of  shortest  sea  paths  using  the  Level  Set  Method  (LSM)  and  Fast  Marching 
Method  (FMM)  are  thus  derived,  implemented  and  utilized  in  idealized  and  realistic  ocean  cases.  An 
FMM-based  methodology  for  the  estimation  of  total  velocity  under  geostrophic  balance  in  complex 
domains  is  also  presented.  Comparisons  with  other  OA  approaches  are  provided,  including  those  using 
stochastically  forced  partial  differential  equations  (SPDEs).  We  find  that  the  FMM-based  OA  scheme  is  the 
most  efficient  and  accurate.  We  also  show  that  the  FMM-based  field  maps  do  not  require  postprocessing 
(smoothing).  Mathematical  and  computational  properties  of  our  new  OA  schemes  are  studied  in  detail 
using  fundamental  theorems  and  illustrations.  We  find  that  higher-order  FMM?s  schemes  improve 
accuracy  and  that  a  multi-order  scheme  is  efficient.  We  also  provide  solutions  that  ensure  the  use  of 
positive-definite  covariances,  even  in  complex  multiply-connected  domains. 

15.  SUBIECT  TERMS 


16.  SECURITY  CLASSIFICATION  OF: 

17.  LIMITATION  OF 

18.  NUMBER 

19a.  NAME  OF 

ABSTRACT 

OF  PAGES 

RESPONSIBLE  PERSON 

a.  REPORT 

unclassified 

b.  ABSTRACT 

unclassified 

c.  THIS  PAGE 

unclassified 

Same  as 
Report  (SAR) 

63 

Standard  Form  298  (Rev.  8-98) 

Prescribed  by  ANSI  Std  Z39-18 


multiply-connected  domains. 

Key  words: 

Field  mapping,  Gauss-Markov  estimation,  Coastal  objective  analysis,  Fast 
Marching  Method,  Level  Set  Method,  Geostrophy,  Levitus  climatology,  World 
Ocean  Atlas  (WOA),  Archipelago. 


1.  Introduction  and  Motivation 

Statistical  field  estimation  theory  was  introduced  by  Gandin  (1965)  to  the 
field  of  meteorology  and  was  extended  to  oceanography  by  Bretherton  et  al. 
(1976)  where  it  is  commonly  referred  to  as  Objective  Analysis  (OA).  The  the¬ 
ory  is  based  on  the  Gauss-Markov  theorem  (Plackett,  1950),  and  it  provides 
a  sound  basis  for  interpolating  irregularly  spaced  data  onto  a  computational 
grid.  Up  to  specifics  of  oceanic  and  atmospheric  fields,  for  example  the  mul¬ 
tiple  scales,  the  OA  scheme  is  equivalent  to  utilize  the  update  steps  of  the 
Kalman  Filter  to  grid  the  irregularly-spaced  data.  Specifically,  the  data  is 
gridded  based  on  specified  prior  field  estimates  and  error  covariance  matri¬ 
ces.  The  OA  methodology  has  been  well  formulated  for  open  oceans  without 
any  landforms  (convex  simply-connected  domains),  but  the  OA  in  complex 
coastal  regions  (multiply-connected  domains)  is  one  of  the  ‘last’  mapping 
problems  which  remains  to  be  studied  in  detail.  This  is  one  of  the  main 
research  questions  of  the  present  work. 

Our  research  is  completed  using  the  Multidisciplinary  Simulation,  Es¬ 
timation  and  Assimilation  System  (Haley  and  Lermusiaux,  2010;  MSEAS, 
2010).  MSEAS  consists  of  a  set  of  mathematical  models  and  computational 
methods  for  ocean  predictions  and  dynamical  diagnostics,  for  data  assim¬ 
ilation  and  data-model  comparisons,  and  for  optimization  and  control  of 
autonomous  ocean  observation  systems.  It  is  used  for  basic  and  fundamen¬ 
tal  research  and  for  realistic  simulations  and  predictions,  recently  includ¬ 
ing  monitoring  (Lermusiaux,  2007),  real-time  acoustic-ocean  predictions  (Xu 
et  ah,  2008;  Lermusiaux  et  ah,  2010)  and  environmental  management  (Cos- 
sarini  et  ah,  2009).  Several  dynamical  models  are  part  of  MSEAS,  including 
a  free-surface  primitive-equation  dynamical  model  which  uses  implicit  two- 
way  nesting  (Haley  and  Lermusiaux,  2010).  This  new  multiscale  free-surface 
code  builds  on  the  primitive-equation  model  of  the  Harvard  Ocean  Predic¬ 
tion  System  (HOPS,  Haley  et  al.  (2009)).  Additionally,  barotropic  tides  are 
calculated  from  an  inverse  tidal  model  (Logoutov  and  Lermusiaux,  2008;  Lo- 


2 


goutov,  2008). 

In  the  multiscale  OA  schemes  of  MSEAS,  the  Kalman  updates  for  data 
gridding  are  carried  out  successively,  from  the  largest  scale  (uniform  mean 
prior)  to  the  smallest  scale,  using  sequential  processing  of  observations  and 
scale  separation.  In  a  two-scale  version,  a  two-staged  OA  approach  (Lermu- 
siaux,  1997,  1999)  maps  the  data  onto  oceanic  fields  in  two  steps:  the  larger 
and  the  smaller  scale  steps.  The  main  inputs  to  one  of  these  steps  are  the 
statistical  description  of  the  field  being  estimated  and  the  observational  noise 
covariance.  While  the  latter  is  dependent  on  the  measurement  sensor,  the 
knowledge  of  the  held  statistics  does  not  come  easily  in  oceanography  due 
to  the  scarcity  of  observations.  The  held  statistics  is  often  provided  by  ana¬ 
lytical  correlation  functions  which  depend  on  the  spatial  separation  distance 
and  the  spatial-temporal  scales  (Carter  and  Robinson,  1987).  Other  MSEAS 
schemes  also  utilize  4D  dynamical  models  to  construct  covariances  (Lermusi¬ 
aux  et  al.,  2000;  Lermusiaux,  2002).  These  dynamical  models  are  sometimes 
successfully  simplified  to  diffusion  models  (Lynch  and  McGillicuddy,  2001) 
and  this  approach  is  also  used  here  to  benchmark  our  new  schemes. 

Our  work  on  new  OA  methodologies  for  complex  coastal  regions  is  mo¬ 
tivated  by  the  Philippines  Straits  Dynamics  Experiment  (PhilEx,  Gordon 
et  al.  (2011)).  The  goal  of  PhilEx  is  to  enhance  understanding  of  the  oceano¬ 
graphic  processes  and  features  arising  in  and  around  straits,  and  to  improve 
the  capability  to  predict  the  inherent  spatial  and  temporal  variability  of 
complex  Archipelago  regions  using  models  and  advanced  data  assimilation 
techniques.  In  addition  to  the  Philippines,  we  have  used  our  new  schemes  in 
several  coastal  regions  with  and  without  islands,  including  the  Taiwan  region, 
New  England  shelf,  Dabob  Bay  and  Monterey  Bay  (Xu  et  al.,  2008;  Lermusi¬ 
aux  et  ah,  2010;  Haley  and  Lermusiaux,  2010).  Other  OA  schemes  have  been 
used  in  coastal  regions  (Hessler,  1984;  Stacey  et  al.,  1988;  Paris  et  al.,  2002), 
but  without  satisfying  coastline  constraints,  in  particular,  there  should  be  no 
direct  relationship  across  landforms.  In  ocean  regions  with  complex  3D  ge¬ 
ometries,  we  found  that  such  schemes  give  field  estimates  that  lead  to  major 
issues  when  used  to  initialize  simulations.  Efficient  and  accurate  method¬ 
ologies  for  held  (e.g.  temperature,  salinity,  biology,  and  velocity)  mapping 
in  complex  multiply-connected  coastal  domains  and  archipelagos  were  thus 
necessary. 

The  schemes  we  derive  estimate  the  sea  paths  between  data  and  model 
points  using  the  Level  Set  Method  (LSM)  and  Fast  Marching  Method  (FMM), 
which  are  techniques  (Sethian,  1999b)  to  evolve  boundaries  using  appropri- 
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ate  partial  differential  equations  (PDEs).  The  FMM-based  OA  methods 
are  shown  to  be  cheaper  and  more  robust  than  others,  in  particular  than 
those  based  on  solving  diffusion-based  PDEs.  We  also  study  computational 
and  mathematical  properties.  We  find  that  higher-order  discretizations  of 
the  level-set  PDEs  increase  the  accuracy  of  distance  estimates,  second-order 
schemes  being  sufficient  for  most  applications.  We  show  that  the  covariance 
matrices  are  not  necessarily  positive  definite  because  the  Weiner  Khinchin 
and  Bochner  theorems  for  positive  definiteness,  e.g.  (Papoulis,  1991),  are  only 
valid  for  convex  simply-connected  domains.  Several  approaches  to  overcome 
this  issue  are  presented  and  evaluated.  The  solutions  we  propose  include 
introducing  a  small  process  noise  or,  better,  reducing  the  covariance  matrix 
based  on  the  dominant  singular  value  decomposition. 

Our  new  methods  are  expected  to  have  many  applications,  in  particular 
to  improve  the  World  Ocean  Atlas  (WOA)  climatologies  in  complex  multiply- 
connected  domains.  The  WOA  provides  global  ocean  climatology  containing 
monthly,  seasonal  and  annual  means  of  temperature  (T)  and  salinity  (S)  fields 
at  standard  ocean  depths.  The  temperature  and  salinity  climatologies  of  the 
WOA  (Levitus,  1982),  which  is  also  termed  as  ‘Levitus  Climatology’  and  its 
atlas  updates  in  1994  (Levitus  and  Boyer,  1994;  Levitus  et  al.,  1994),  1998 
(Antonov  et  al.,  1998a, b,c;  Boyer  et  al.,  1998a, b,c),  2001  (Stephens  et  ah, 
2002;  Boyer  et  ah,  2002)  and  2005  (Locarnini  et  ah,  2006;  Antonov  et  ah, 
2006;  Garcia  et  al.,  2006a, b),  have  proven  to  be  valuable  tools  for  studying 
the  hydrographic  structures  of  the  World’s  oceans.  The  WOA  climatologies 
have  been  particularly  useful  for  providing  initial  and  boundary  conditions  to 
ocean  circulation  models.  The  OA  procedure  for  the  ‘Levitus  Climatology’ 
requires  the  use  of  an  analytical  correlation  function  to  determine  the  covari¬ 
ance  (or  weight  function,  as  described  by  Levitus  (1982)).  If  the  “straight 
Euclidean  distance”  (the  straight  line  distance  between  two  points)  is  used 
in  such  analytical  correlation  functions,  the  distance  estimate  is  inappro¬ 
priate  for  complex  multiply-connected  domains,  as  this  “straight  Euclidean 
distance”  goes  across  land  and  so  violates  all  coastlinc/bottom  constraints. 
In  particular,  unconnected  water  masses  are  then  erroneously  blended  across 
landforms,  leading  to  artificial  water  masses,  spurious  currents  and  other  fic¬ 
titious  features.  The  aim  of  our  new  methodologies  is  to  satisfy  all  geometric 
constraints  arising  in  complex  multiply-connected  domains  and  so  rectify  all 
of  these  issues. 

The  paper  is  organized  as  follows.  The  problems  addressed  are  described 
in  Sect.  2.  In  Sect.  3,  we  review  the  two  staged  multi-scale  statistical 
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field  mapping  approach  from  MSEAS.  In  Sect.  4,  we  introduce  the  new 
OA  methodologies  based  on  the  Level  Set  Method  and  the  Fast  Marching 
Method.  An  optimization  approach  for  computing  the  transport  streamfunc- 
tion  and  total  velocity  under  geostrophic  balance  by  minimizing  the  unknown 
inter-island  transports  is  also  discussed.  The  OA  approach  based  on  the 
stochastically  forced  partial  differential  equations  (SPDE)  is  introduced  in 
Sect.  5.  In  Sect.  6,  applications  of  our  new  methodologies,  for  the  complex 
regions  of  Dabob  Bay  and  Philippines  Archipelago  are  presented.  In  Sect.  7, 
we  study  the  computational  properties  of  our  new  mapping  schemes.  Sec¬ 
tion  8  consists  of  a  summary  and  conclusions.  The  scheme  to  compute  the 
‘Levitus  Climatology’  maps  is  summarized  in  App.  A,  the  FMM  algorithm 
in  App.  B  and  the  algorithm  for  minimizing  unknown  inter-island  transports 
in  App.  C. 

2.  Problem  Statement 

We  begin  by  introducing  the  definitions  of  convex  domains,  simply  and 
multiply  connected  domains.  A  domain  is  said  to  be  convex  if  for  every 
pair  of  points  within  the  domain,  every  point  on  the  straight  line  segment 
that  joins  them  is  also  within  the  domain.  A  domain  is  said  to  be  simply- 
connected  if  any  closed  curve  within  it  can  be  continuously  shrunk  to  a  point 
without  leaving  the  domain.  A  domain  which  is  not  simply-connected  is 
called  multiply-connected. 

A  main  research  question  of  this  work  is  held  mapping  via  OAs  in  com¬ 
plex  multiply-connected  coastal  domains.  OA  schemes  require  a  description 
of  held  statistics  which  is  often  provided  by  analytical  correlation  functions 
(Carter  and  Robinson,  1987;  Lam  et  al.,  2009).  Such  analytical  correlation 
functions  are  dependent  on  the  spatial  separation  distance.  Using  “straight 
Euclidean”  distances  in  complex  multiply-connected  domains  is  not  appro¬ 
priate  since  there  is  no  direct  relationship  across  landfornrs.  An  appropriate 
measure  of  distance  should  be  longer.  The  most  straightforward  is  the  length 
of  the  shortest  sea  path  i.e.,  the  shortest  path  without  going  across  complex 
landfornrs.  Examples  of  such  paths  that  we  computed  for  the  Monterey  Bay, 
Massachusetts  Bay,  Dabob  Bay  and  Philippines  Archipelago  are  illustrated 
in  Fig.  1.  Our  new  methodology  measures  these  distances  most  efficiently. 
It  also  allows  altering  distances  to  account  for  dynamical  or  other  effects. 
For  example,  our  estimation  of  3D  shortest  sea  paths  can  be  set-up  such 
that  vertical  distances  are  weighted  more  than  horizontal  ones,  hence  ac- 
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counting  for  effects  of  reduced  correlations  across  depths.  In  general,  any 
coordinate  system  can  be  used:  if  instead  of  depth,  density  surfaces  are  em¬ 
ployed,  diapycnal  paths  can  be  weighted  more  than  isopycnal  paths.  All  of 
these  generalizations  of  the  shortest  sea  path,  as  well  as  correlation  functions 
that  are  constrained  by  dynamical  or  feature-based  considerations,  can  be 
easily  accommodated  in  our  new  OA  methodology. 

The  physical  shortest  sea  paths,  or  any  generalization  of  such  paths,  in 
complex  multiply-connected  regions  can  be  efficiently  obtained  using  the 
following  numerical  techniques:  the  Level  set  method  (LSM)  (Osher  and 
Sethian,  1988;  Sethian,  1999b)  and  the  Fast  Marching  Method  (FMM)  (Sethian, 
1996,  1999b).  These  methods  model  the  propagation  of  evolving  boundaries 
using  appropriate  PDE’s.  Here,  we  illustrate  their  applications  for  realistic 
OAs  in  both  the  Philippines  Archipelago  and  Dabob  Bay  (WA,  USA)  regions. 
Other  optimization  methods  for  path  planning,  for  example  Dijkstra’s  algo¬ 
rithm  (Bertsimas  and  Tsitsiklis,  1997)  and  Bresenham-based  line  algorithm 
(Bresenham,  1965)  could  also  be  used  for  mapping  in  complex  domains,  but 
we  find  and  show  that  the  FMM  and  LSM  schemes  are  computationally  more 
efficient  and  more  accurate.  We  also  compare  our  results  to  the  OA  approach 
based  on  solving  stochastically  forced  PDEs  (Balgovind  et  ah,  1983;  Lynch 
and  McGillic-uddy,  2001). 

The  FMM  and  LSM  can  also  be  utilized  for  estimating  the  minimum 
vertical  area  along  any  path  between  two  islands.  The  advantage  of  the 
FMM  and  LSM  is  that  this  can  be  efficiently  computed  for  all  island  pairs 
in  complex  domains  with  many  islands.  Such  areas  are  needed  to  estimate 
total  velocities  and  transports  under  a  geostrophic  constraint  (Wunsch,  1996) 
with  our  hydrographic  OAs.  Specifically  in  our  case,  these  vertical  areas 
are  used  in  the  inversion  for  the  transport  streamfunction  along  the  island 
coastlines.  The  resulting  temperature,  salinity  and  velocity  held  estimates 
can  then  be  used  as  first-guess  in  3D  mapping  of  primitive-equation  fields 
and  error  covariances  (Lermusiaux  et  ah,  2000;  Lermusiaux,  2002). 

Mathematical  and  computational  properties  of  the  new  mapping  schemes 
are  also  investigated  in  detail.  To  reduce  the  computational  cost  and  to 
understand  the  impact  of  individual  data,  sequential  processing  of  observa¬ 
tions  (Parrish  and  Cohn,  1985;  Clio  et  al.,  1996)  is  utilized.  By  definition,  the 
prior  covariance  matrix  should  be  positive  definite.  According  to  the  Wiener- 
Khinchin  and  Bochner  theorem  (Papoulis,  1991;  Yaglom,  2004;  Dolloff  et  ah, 
2006),  a  covariance  matrix  based  on  an  analytical  correlation  function  will 
be  positive  definite  if  the  Fourier  transform  (or  the  spectral  density  of  the 
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correlation  function)  is  non-negative  for  all  frequencies.  These  theorems  are 
valid  only  for  convex  simply-connected  domains.  In  our  complex  multiply- 
connected  domains,  the  covariance  matrix  may  become  negative  due  to:  a) 
Numerical  errors  in  the  computation  of  the  shortest  sea  path’s  length  using 
our  new  FMM/LSM  based  schemes,  or,  b)  The  presence  of  landforms.  These 
issues  may  lead  to  divergence  problems  (Brown  and  Hwang,  1997)  in  the 
held  mapping.  Therefore,  the  following  two  questions  were  investigated  and 
resolved:  a)  What  are  the  computational  errors  in  the  sea  path  lengths  com¬ 
puted  using  the  FMM/LSM  and  how  can  they  be  reduced?,  and,  b)  What 
are  the  computational  issues  including  non-positive  definite  covariances  that 
arise  in  mapping  data  in  a  multiply-connected  coastal  domain  and  how  can 
they  be  remedied?  Answering  these  questions  was  indispensable  for  the  de¬ 
velopment  of  the  FMM/LSM  based  scheme  for  complex  multiply-connected 
domains. 

3.  MSEAS  Objective  Analysis  Approach 

Bayesian-based  OA  schemes  are  well  established  for  mapping  heteroge¬ 
neous,  multivariate,  irregular  data  (Gandin,  1965;  Bretherton  et  al.,  1976; 
Carter  and  Robinson,  1987;  Daley,  1993)  in  open  oceans,  without  islands  or 
archipelagos,  as  well  as  in  atmospheric  sciences.  Most  OA  schemes  utilize 
the  Gauss-Markov  or  minimum  error  variance  criterion  (Plackett,  1950)  to 
map  observations  to  the  numerical  grid  and  they  require  the  computation 
of  Euclidean  distances  between  all  data  and  model  points.  Within  MSEAS, 
our  multi-scale  OA  scheme  consists  of  the  successive  utilization  of  Kalman 
update  steps,  one  for  each  scale  and  for  each  correlation  across  scales  (Ler- 
musiaux  et  ah,  2000;  Lermusiaux,  2002).  In  particular,  our  two-scale  OA 
version  is  summarized  in  (Lermusiaux,  1997,  1999). 

Considering  one  scale  or  one  interaction  between  two  scales,  let  us  denote 
the  vector  of  numerical  grid  point  locations  as  x  and  the  vector  of  measure¬ 
ment  locations  as  X,  then  the  OA  estimate  of  the  held  for  that  scale  or 
interaction  (0OA)  based  on  the  latest  background  held  (0,d)  is  given  by: 

0OA  =  -0  +  (7or(x,  X)[C'or(X,  X)  +  R]-1[d  —  d] 

=  0  +  KOA[d  —  d]  (1) 

where  Cor(x,  X)  is  the  correlation  matrix  between  grid  and  data  points  (for 
multivariate  OAs  or  3D  OAs,  it  is  a  normalized  covariance  matrix,  see  Ler¬ 
musiaux  (2002)),  d  =  H0,  H  is  the  observation  matrix,  d  is  the  sensor  data 


7 


vector,  R  is  the  error  covariance  matrix  for  the  sensor  data  d  (for  the  scale 
considered)  at  data  points,  and  the  gain  KOA  is  given  by: 

Koa  =  Cor(x,  X)[Cor(X,  X)  +  R]”1  (2) 

The  error  covariance  of  the  estimated  held  (for  one  scale)  is  then  given  by 
(where  E[]  denotes  the  expectation  operator): 

POA  =  E[(x-E[X])(x-E[x])t] 

=  Cor(x,  x)  —  KOACor(X,  x).  (3) 

A  comparison  between  our  above  update  equations  for  the  OA  for  one  scale 
and  the  Kalman  filter  (KF)  update  equations  (using  underscore  t  to  indicate 
time  t  is  made  in  Table  1). 


KF  Update  Equations 
hline  Kalman  gain: 

Kt=Pt|t-1HtTx[HtPt|t_1HtT+Rt  T1 

MSEAS  OA  Update  equations 

OA  Gain: 

K  =  Cor(x,  X)  x  [CW(X,  X)  +  R]-1 

State  estimate  update: 

xt  =  xt|t— i  +  Kt(yt  -  Htxt|t-i) 

State  estimate  update: 
i/jOA  =  $  +  K[d-d]. 

Error  covariance  update: 

Pt  =  (I  —  KtHt)Pt|t-1 

Error  covariance  update: 

pOA  =  C'or(x,x)-KOA  x  Cor(X,x) 

Table  1:  Comparison  between  the  Kalman  Filter  and  the  MSEAS  OA  update 
equations  (for  a  univariate  variable  and  one  scale). 


Thus,  if  covariances  in  time  are  not  considered,  the  update  equations  of 
the  OA  of  one  scale  are  equivalent  to  the  update  equations  of  the  discrete 
Kalman  filter  algorithm.  The  background  error  correlation  matrix  for  the 
hcld-to-data  points,  Cor(x,X),  and  the  background  correlation  matrix  at  the 
data  points,  Cor(X,X),  are  directly  related  to  the  KF  a  priori  error  covari¬ 
ance  matrix  Pt|t_i  i.e.  Cor(x,X)  =  Pqt-iH)1’  and  Cor(X,X)  =  HtPt|t_1H^’ 
(Ht  is  the  observation  matrix).  In  2D  horizontal  OAs  for  a  single  variable, 
the  matrix  R  is  often  chosen  diagonal  with  a  uniform  non-dimensional  obser¬ 
vational  error  variance  <r^,  i.e.  R  =  ajl.  In  MSEAS,  the  correlation  matrices 
for  a  given  scale  are  usually  generated  from  the  isotropic  function: 


(  r 2  \ 

.  r2  At2' 

{  Wexp 

0-5  X  ^ L2e  +  r2  ^ 
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Cor(r ) 


(4) 


Here,  At  is  the  difference  between  the  observation  and  the  estimation  time 
and  r  is  the  decorrelation  time  scale.  This  correlation  in  time  effect  extends 
the  direct  Kalman  update  step  at  a  single  time  to  a  smoothing  OA  step 
using  data  from  different  but  synoptic  times.  The  parameters  L0  and  Le  are 
the  zero-crossing  and  the  e-folding  length  scales.  The  scalar  r  is  the  spatial 
separation  distance. 

The  MSEAS  OAs  are  often  carried  out  in  two  stages  (Lermusiaux,  1999). 
In  the  first  stage,  the  largest  dynamical  scales  (denoted  LS)  are  mapped  onto 
the  computational  grid  using  the  parameters  (r,  L0,  Le) ls-  The  background 
held  for  this  stage  is  often  chosen  to  be  equal  to  the  horizontal  mean  of  all  the 
observations.  In  the  second  stage,  the  smaller  scales  are  mapped  using  the 
coefficients  (r,  L0,  Le) me  often  corresponding  to  the  most  energetic  (meso) 
scales.  The  background  held  for  this  stage  is  the  hrst  stage  OA.  A  major 
assumption  in  this  OA  approach  is  that  the  errors  in  the  largest  and  the  most 
energetic  stages  are  statistically  independent.  A  3D  and  dynamics-based 
extension  of  this  approach,  including  multiscale  interactions,  is  presented  in 
(Lermusiaux  et  al.,  2000;  Lermusiaux,  2002):  this  3D  multiscale  approach 
also  benefits  from  our  new  efficient  estimation  of  shortest  sea  paths.  Of 
course,  the  accuracy  of  the  held  estimates  also  depends  on  the  spatial  and 
time  scale  parameters  used  in  the  analytical  correlation  function,  as  well  as 
on  the  correlation  function  itself.  The  2D  horizontal  version  of  the  MSEAS 
OA  has  many  similarities  with  the  approach  used  for  ‘Levitus  Climatology’ 
maps  (Levitus,  1982;  Locarnini  et  al.,  2006;  Antonov  et  ah,  2006;  Garcia 
et  al.,  2006a, b)  which  is  described  in  App.  A. 

The  ‘Levitus  Climatology’  and  above  two-scale  MSEAS  OA  mapping 
schemes  compute  the  covariance  or  weight  factors  by  providing  Euclidean 
distances  as  inputs  to  correlation  functions.  If  they  are  not  employed  in 
open  ocean  conditions,  actual  sea  distances  between  data  and  model  points 
without  going  across  complex  landforms  or  through  bathymetry  are  needed. 
The  LSM  or  FMM  presented  next  in  Sect.  4  are  used  to  obtain  such  short¬ 
est  sea  distances  between  any  two  points  in  a  complex  (e.g.  multi-island) 
multiply-connected  coastal  region. 

4.  Methodologies  for  estimating  the  length  of  shortest  sea  paths  in 

complex  coastal  regions  and  archipelagos 

The  shortest  sea  paths  between  data  and  model-grid  points  in  complex 
multiply-connected  coastal  regions  are  efficiently  computed  using  the  LSM 
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and  FMM.  These  paths  are  then  input  to  our  MSEAS  software  for  multi¬ 
scale  OAs.  The  LSM  and  FMM  methods  are  both  more  accurate  and  com¬ 
putationally  cheaper  than  the  conventional  Bresenham-based  line  algorithm 
(Bresenham,  1965)  and  Dijkstra’s  algorithm  (Bertsimas  and  Tsitsiklis,  1997). 
Comparisons  to  these  other  algorithms  are  discussed  in  (Agarwal,  2009),  key 
results  are  summarized  in  Sect.  7.1. 


4-1.  Objective  Analysis  using  the  Level  Set  Method  (LSM) 

A  level  set  of  a  real- valued  function  </>  of  n  variables  is  a  set  of  the  form: 

{(xi,  ...,xn)\(f>(xi, ...,  xn)  =  c}  (5) 

where  c  is  a  constant  and  Xi  are  the  n  variables.  That  is,  a  level  set  is  the 
set  of  points  where  the  function  (j)  takes  on  a  given  constant  value  c. 

Osher  and  Sethian  (1988)  proposed  a  numerical  technique,  which  is  called 
the  Level  Set  Method,  to  implicitly  represent  and  model  the  propagation  of 
evolving  level  set  interfaces  under  the  influence  of  a  given  velocity  field  using 
appropriate  PDEs.  An  initial  value  formulation  describing  the  interface  mo¬ 
tion  is  now  discussed.  The  initial  position  of  interfaces  are  given  by  level  sets 
of  the  function  (f>.  The  evolution  of  this  function  (j)  is  linked  to  the  propaga¬ 
tion  of  the  interface  through  a  time-dependent  level  set  equation.  Interfaces 
can  be  represented  explicitly  (parametrized  interfaces  i.e.  interfaces  given  by 
x  =  x(s),  where  s  is  the  parameter)  or  implicitly  (e.g.  interfaces  given  by  the 
zero  level  set  i.e.  0(x)  =  0).  Using  the  implicit  representation  </>(x),  where  x 
is  the  position  vector,  a  convection  equation  can  be  solved  to  propagate  level 
sets  advected  by  a  velocity  field  v: 

0t  +  v.V0  =  0  (6) 


In  many  cases,  one  is  interested  only  in  the  motion  normal  to  the  boundary. 
Therefore,  the  velocity  v  can  be  represented  using  the  scalar  speed  function 
F  and  the  normal  direction  n.  Thus. 


v  =  F  n 


IW| 


(7) 


The  hyperbolic,  non-linear  (Hamilton-Jacobi  equation)  level  set  equation, 
obtained  from  Eqns.  6  and  7,  is  given  by: 


4>t  +  F\ V0|  =  0 


(8) 
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Integrating  the  level  set  equation  is  an  initial  value  problem  which  tracks  the 
evolution  of  the  level  sets  0=constant  assuming  F  is  given  by  the  specifics  of 
the  evolution  of  the  <f>  for  a  particular  problem.  The  following  first  order  up- 
winded  finite  difference  approximation  can  be  used  to  integrate  this  equation 
8  (2- dimensional  in  space)  (Osher  and  Sethian,  1988;  Sethian,  1999b): 

C+1  =  ~  &t[max(F,  0)  V+  +  min(F,  0)  V~-] 


where, 


Vy  =  [max{D~x4>™ji  0)2  +  min{D+x(f)F ,  0)2  + 
max(D~y 0)2  +  min(D+y 0T,  0)2]1/2 
V; ;  =  [min(D~x 0)2  +  max{D+x4>^ ,  0)2  + 

min(D~y 0)2  +  max(D+y 0T,  0)2]1/2  (9) 


Here,  D“x  is  the  first  order  backward  difference  operator  in  the  x-direction; 
D+x  is  the  first  order  forward  difference  operator  in  the  x-direction,  etc. 
Mathematically,  these  operators  are  given  by: 


D~ 


Ax 


D 


+X, 


4*1+1,! 


'h3 


Ax 


(10) 


The  above  numerical  technique  of  the  Level  Set  Method  can  be  used  to 
solve  the  Eikonal  Equation  as  described  next.  If  the  scalar  speed  function  of 
the  front  F  is  non-negative,  then  the  steady  state  boundary  value  problem, 
known  as  the  Eikonal  equation,  can  be  formulated  to  evaluate  the  arrival 
time  function  T(x).  The  Eikonal  equation  representing  the  time  T(x)  for 
the  “frontal  interface”  to  reach  the  position  x  from  its  initial  position  is 
given  by: 


F|VT|  =  1  (11) 

The  Eikonal  equation  simply  states  that  the  gradient  of  the  arrival  time 
function  is  inversely  proportional  to  the  local  speed  of  the  front.  To  solve  the 
Eikonal  equation,  a  time  dependent  problem  is  proposed.  The  time  evolved 
steady  state  solution  of  the  resultant  Hamilton-Jacobi  equation  is  the  Eikonal 
equation.  Mathematically,  this  is  written  as: 

Tt  +  F|VT|  =  1  st-^iv  f\VT\  =  1  (12) 
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This  Hamilton- Jacobi  equation  (Eqn.  12  (Left))  can  be  discretized  using  the 
numerical  scheme  for  the  Level  Set  equation.  The  steady  state  solution  of 
this  Hamilton-Jacobi  equation  will  be  the  solution  of  the  Eikonal  equation 
(Eqn.  12  (Right)). 

The  Level  Set  Method  has  been  used  in  a  wide  variety  of  applications 
which  include  arrival  time  problems  in  control  theory,  generation  of  minimal 
surfaces,  flame  propagation,  fluid  interfaces,  shape  reconstruction  etc.  In 
the  oceanic  context,  the  method  can  be  used  to  determine  shortest  sea  path 
lengths  as  follows.  The  scalar  speed  function  F  is  set  to  0  for  the  grid  points 
on  land  and  1  for  the  grid  points  on  water.  The  level  set  T(x),  which  is 
the  arrival  time  function,  then  also  represents  the  shortest  sea  distance  from 
the  starting  position  to  the  position  vector  x.  This  is  because  the  level  set 
T,  which  is  the  arrival  time,  when  multiplied  by  the  local  speed  of  the  front 
(equal  to  1  in  this  case)  gives  the  level  set  T  itself  for  the  shortest  sea  path 
length  estimate.  Once  these  sea  distances  between  all  data  points  and  model 
points  are  available,  the  prior  correlation  functions  can  be  evaluated  and  the 
correlation  matrices  filled  in  Eqn.  1.  An  OA  can  then  be  computed. 

Operation  count  for  the  LSM:  The  computation  of  shortest  sea  paths  via 
the  LSM  requires  evolving  of  all  the  level  sets  in  Eqn.  8  and  not  simply 
the  zero  level  set  corresponding  to  the  front  itself.  The  LSM  thus  has  an 
operation  count  of  0(N3)  in  two  dimensions  for  N2  grid  points  (Sethian, 
1999b).  It  is  computationally  expensive  since  an  extra  dimension  is  added 
to  the  problem. 

A  modified  approach  named  ‘Fast  Marching  level  set  method’,  which  sig¬ 
nificantly  reduces  the  operation  count,  is  described  next.  Roughly  speaking, 
the  two  possible  ways  to  obtain  steady-state  are  either  iteration  towards  the 
solution,  or  direct  construction  of  the  stationary  solution  T.  While  LSM  con¬ 
structs  the  solution  to  the  Eikonal  Eqn.  11  by  iterating  towards  the  solution, 
FMM  is  based  on  direct  construction  of  the  stationary  solution  T. 

4-2.  Objective  Analysis  using  the  Fast  Marching  Method  (FMM) 

The  Fast  Marching  Method  (FMM)  for  monotonically  advancing  fronts 
has  been  proposed  by  Sethian  (1996,  1999b).  This  method  leads  to  a  very 
fast  scheme  for  solving  the  Eikonal  Eqn.  11.  The  Level  set  method  relics 
on  computing  the  evolution  of  all  the  level  sets  by  solving  an  initial  value 
PDE  using  numerical  techniques  from  hyperbolic  conservation  laws.  This 
is  because  the  Level  Set  Method  iteratively  solves  the  level  set  equation  to 
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compute  the  steady  state  solution  which  corresponds  to  the  solution  of  the 
Eikonal  equation  (Eqn.  12).  As  an  alternative,  an  efficient  modification  is 
to  perform  the  work  only  in  the  neighborhood  of  the  zero  level  set,  as  this 
is  known  as  the  ‘narrow  band  approach’.  The  basic  idea  is  to  tag  the  grid 
points  as  either  “alive”,  “land  mines”  or  “far  away”  depending  on  whether 
they  are  inside  the  band,  near  its  boundary,  or  outside  the  band,  respectively. 
The  work  is  performed  only  on  alive  points,  and  the  band  is  reconstructed 
once  the  land  mine  points  are  reached. 

The  FMM  solves  boundary  value  problems  without  iterations.  The  method 
is  applicable  to  monotonically  advancing  fronts  (i.e.  the  front  speed  (F  >  0 
or  F  <  0  )  which  are  governed  by  the  level  set  equation  (Eqn.  12).  The 
steady  state  form  of  the  level  set  equation  is  the  Eikonal  equation  (Eqn.  11) 
which  says  that  the  gradient  of  the  arrival  time  surface  is  inversely  propor¬ 
tional  to  the  speed  of  the  front.  For  the  two  dimensional  case,  the  stationary 
boundary  value  problem  is  given  by: 

\VT\F(x,y)  =  l  s.t.  T  =  {{x,y)\T{x,y)  =  0}  (13) 

where  T  is  the  starting  position  of  the  interface.  The  first  order  finite  differ¬ 
ence  discretization  form  of  the  Eikonal  equation  (Sethian,  1999b)  at  the  grid 
point  (i,j)  is  given  by: 

jnax(I):j'T.  0)2  +  min(D±xT ,  0)2  + 
max{D~yT ,  0)2  +  min{D+yT ,  0)2]1/2  = 

or, 

[max(max(D~jxT,  0),  —min(D^xT,  0))2  + 
max(max(D^yT,  0),  —  min(D^yT,  0))2]  =  (14) 

Equation  14  is  essentially  a  quadratic  equation  for  the  value  at  each  grid  point 
(assuming  that  values  at  the  neighboring  nodes  are  known).  An  iterative 
algorithm  for  computing  the  solution  to  Eqn.  14  was  introduced  by  Ruoy  and 
Tourin  (1992).  FMM  is  based  on  the  observation  that  the  upwind  difference 
structure  of  Eqn.  14  means  that  the  information  propagates  “one  way”,  i.e. 
from  the  smaller  values  of  T  to  the  larger  values.  Therefore,  FMM  rests 
on  solving  Eqn.  14  by  building  the  solution  outward  from  the  smallest  time 
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value  T.  The  front  is  swept  ahead  in  an  upwind  manner  by  considering  a  set 
of  points  in  a  narrow  band  around  the  existing  front  and  bringing  new  points 
into  the  narrow  band  structure.  The  fast  marching  algorithm  is  discussed  in 
detail  in  App.  B  (see  also  (Agarwal,  2009)). 

The  use  of  higher-order  FMMs  (or  LSMs)  to  reduce  errors  in  the  estima¬ 
tion  of  shortest  sea  path  lengths  is  discussed  in  Sect.  7.2.  They  are  compu¬ 
tationally  more  expensive  but  can  be  necessary  for  robust  and  accurate  OAs 
because  in  complex  multiply-connected  domains,  we  found  that  covariance 
matrices  were  sensitive  to  the  accuracy  of  these  lengths.  These  findings  are 
discussed  later  in  Sections  7.2  and  7.3. 

Operation  count  for  the  FMM:  Once  again,  for  estimating  the  optimal 
distance,  the  scalar  speed  function  F  is  set  to  0  for  the  grid  points  on  land 
and  1  for  the  grid  points  on  water.  However,  the  FMM  has  a  significantly 
lower  operation  count  of  0(N2  Log  N)  for  N2  grid  points  (Sethian,  1999b). 
It  is  computationally  much  cheaper  than  the  LSM  explained  above. 

The  Fast  Marching  Method,  as  discussed  above,  is  an  efficient  way  to 
compute  the  sea  distance  between  any  two  locations.  These  sea  distances 
can  then  be  used  for  setting  up  the  covariance  matrix  using  any  distance- 
dependent  analytical  correlation  function  (e.g.  Eqn.  4).  Note  that  the  cost 
of  the  OAs  proper  are  the  same  for  both  the  LSM  and  FMM. 

4-  3.  Total  velocity  under  geostrophic  balance:  Estimating  the  minimum  ver¬ 
tical  area  in  complex  coastal  regions  and  archipelagos 

Classically,  the  synoptic  ocean  data  that  are  most  abundant  are  hydro- 
graphic  (temperature  and  salinity)  measurements.  If  these  data  are  first 
gridded  by  OAs,  they  can  be  used  to  estimate  a  velocity  field  under  the  con¬ 
straint  of  geostrophic  shear  (Wunsch,  1996)  or  other  momentum  balance  as¬ 
sumptions  including  full  momentum  conservation  of  the  primitive-equations 
(Lermusiaux  et  ah,  2000;  Lermusiaux,  2002).  If  geostrophic  shear  is  used  as 
the  constraint,  to  compute  transport  estimates  from  the  hydrographic  OAs, 
a  reference  velocity  is  required.  In  complex  domains,  an  estimate  of  the  area 
of  the  sea  cross-sections  between  any  two  landforms  (e.g.  islands)  is  also 
often  necessary  to  set  the  inter-islands  transports.  The  FMM  can  be  directly 
used  to  compute  the  minimum  of  these  cross-sectional  areas. 

In  our  case,  we  utilize  an  optimization  scheme  to  estimate  these  inter¬ 
island  transports,  see  (Haley  et  ah,  2011;  MSEAS,  2010).  The  scheme  is 
summarized  in  App.  C.  Its  objective  is  to  find  a  set  of  values  for  the  trans¬ 
port  streamfunction  (T)  along  the  island  coastlines  that  produce  a  suitably 
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smooth  (initial)  velocity  field,  e.g.  without  unrealistic  velocities.  If  prior 
estimates  of  specific  transports  between  islands  are  known,  they  are  utilized 
with  their  uncertainties  as  inputs  to  the  optimization  scheme.  If  such  prior 
estimates  are  not  available,  they  are  set  using  a  minimum  energy  principle: 
a  norm  of  the  total  velocity  between  the  corresponding  islands  is  minimized 
under  the  constraint  of  geostrophic  velocity  shear  balancing  the  hydrographic 
OA  maps.  To  do  so,  the  weight  functions  require  an  estimate  of  the  cross- 
sectional  area  between  islands.  This  is  not  easy  to  compute  exactly  without 
a  FMM/LSM  approach. 

With  the  FMM/LSM  schemes,  the  minimum  vertical  area  can  be  obtained 
if  we  solve  the  Eikonal  Eqn.  11  setting  the  scalar  speed  function  to  be  F(x,y) 
=  1/H(x,y).  The  Eikonal  equation  thus  simplifies  to  |VTj  =  H,  which  shows 
that  the  solution  T(x,y)  of  this  Eikonal  equation  will  be  the  minimum  vertical 
area.  This  new  approach  is  used  in  Sect.  6  to  obtain  velocity  estimates  from 
our  hydrographic  FMM-based  OA  maps. 

5.  Objective  Analysis  using  stochastically  forced  partial  differential 
equations  (SPDE’s) 

Another  OA  approach  that  accounts  for  landforms  is  based  on  using 
SPDE’s.  The  central  idea  is  to  represent  the  underlying  field  variability 
as  an  outcome  of  a  stochastic  process  using  a  SPDE  where  the  stochasticity 
represents  the  uncertainty  in  this  differential  equation.  The  SPDE  is  defined 
only  over  the  sea  domain  so  as  to  account  for  geometric  constraints.  The 
covariance  matrix  for  the  field  is  then  constructed  numerically,  by  solving  a 
set  of  SPDEs  over  the  sea  domain.  For  example,  the  stochastically  forced 
Helmholtz  equations  in  1-D  and  2-D  in  space  for  the  field  0  in  an  unbounded 
domain  (Balgovind  et  ah,  1983)  are  associated  with  the  following  covariance 
functions  respectively: 

^Jr  -  =  e(x)  oC^(r)  =  (1  +  fcr)e(-fcr) 

V20  —  /c20  =  e(x,  y )  C^(r)  =  krK^kr) 


,  kr  — >  oo  (15) 

where,  is  the  Bessel  function  of  the  second  kind.  The  process  noise  e 
is  a  random  disturbance  with  mean  0,  standard  deviation  1  and  no  spatial 
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correlation.  Also,  the  length  scale  corresponds  to  the  inverse  of  the  SPDE  pa¬ 
rameter  (k).  Denman  and  Freeland  (1985)  and  Weaver  and  Courtier  (2001) 
have  proposed  other  correlation  functions  which  can  also  be  linked  to  appro¬ 
priate  SPDE’s. 

A  major  advantage  of  this  SPDE  approach  is  that  the  £eld-to-£eld  co- 
variance  Cor(x.x)  can  be  computed  numerically  from  the  discretized  SPDE 
along  with  appropriate  boundary  conditions  (i.e.  no  flux  boundary  condi¬ 
tion  across  islands)  to  directly  account  for  the  coastline  constraints  (Lynch 
and  McGillicuddy,  2001).  The  discretization  of  SPDEs  (such  as  Eqn.  15)  or 
any  other  differential  operator  defined  on  the  sea  domain  usually  amounts  to 
solving  a  matrix  equation  of  the  form: 

[A\W  =  {e}  (16) 

where  {e}  is  the  spatial  discretization  of  the  process  noise  e.  All  the  coastline 
constraints  are  then  incorporated  automatically  in  this  matrix  form  (16). 
Since  [Cee\  =  [/],  the  covariance  matrices  for  ficld-to-field  points  and  field- 
to-data  points  are  directly  obtained  from  Eqn.  16: 

Cor  fax)  =  [A]~1[Cee][A]~T  =  ([AflA])-1 
Cor  (it,  X)  =  [A]~1[Cee][A]~T[H]T  =  ([A]T[A])~1[H]T  (17) 

The  covariance  matrix  (17)  obtained  using  the  SPDE  approach  can  be  used 
along  with  Gauss-Markov  Estimation  theory  (see  Table  1)  to  perform  OAs  in 
coastal  regions.  A  limitation  of  this  approach  is  that  the  resulting  fields  can 
be  affected  by  the  discretization  error  associated  with  the  discretized  form  of 
the  SPDE.  In  fact,  we  found  that  we  often  need  to  postprocess  (smooth  out) 
the  SPDE-gridded  fields  to  remove  spurious  field  gradients.  Such  gradients, 
even  when  small,  can  lead  to  spurious  velocities  by  aggregate  integration  in 
the  vertical  for  the  estimation  of  total  velocity  under  geostrophic  balance. 
It  has  also  been  verified  that  that  the  SPDE  approach  is  computationally 
expensive  when  compared  to  our  new  FMM-based  methodology. 

A  similar  variant  of  the  above  methodology  represents  the  covariance 
function  (C'yy),  instead  of  the  field  (?/?),  by  a  SPDE,  e.g.  a  stochastic 
Helmholtz  equation  (Logutov,  personal  communication).  The  advantage  is 
that  the  covariances  required  are  then  computed  directly,  without  the  need 
of  Eqns.  17,  which  is  much  cheaper.  However,  the  noise  in  the  resulting  OA 
fields  are  then  found  to  be  even  larger  (Agarwal,  2009).  An  heuristic  reason  is 
that  this  simpler  representation  corresponds  to  carrying  out  a  “smoothing” 
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step  using  the  Helmholtz  equation  only  once  as  compared  to  twice  in  the 
original  representation  (Eqn.  17).  Both  of  these  methods,  the  SPDE  speci¬ 
fied  for  the  held  (^)  and  the  SPDE  specified  for  the  covariance  (C'^)  were 
implemented.  They  are  utilized  for  comparisons  with  onr  new  LSM-based 
and  FMM-based  schemes. 

Even  though  many  different  SPDE’s  could  be  utilized  for  mapping  a  field, 
in  our  examples,  we  selected  the  stochastically  forced  Helmholtz  equation  for 
three  reasons.  First,  the  dynamics  of  the  atmosphere  can  be  approximately 
governed  on  the  time  scale  of  a  few  days  by  a  Helmholtz-like  equation,  which 
is  the  equation  for  the  conservation  of  potential  vorticity  under  the  assump¬ 
tions  of  a  quasi-geostrophic,  frictionless,  shallow  water  model  without  topog¬ 
raphy  (Balgovind  et  ah,  1983;  Pedlosky,  1987).  Second,  a  Helmholtz  equa¬ 
tion  can  be  obtained  from  the  diffusion  or  wave  equations  and  background 
correlations  are  seldom  modeled  as  Gaussian,  by  solving  a  pseudo-diffusion 
equation  (Derber  and  Bouttier,  1999).  In  these  linear  PDE’s,  if  the  solution 
is  assumed  separable  in  time  and  space,  one  obtains  for  the  time  variation 
an  ordinary  differential  equation  of  the  first  order.  For  the  spatial  variations, 
one  always  obtains  a  Helmholtz  equation  (Selvadurai,  2000),  which  is  the 
equation  that  would  be  used  for  spatial  mapping.  Thirdly,  the  Helmholtz 
equation  is  equivalent  to  a  steady  diffusion-reaction  equation. 

Operation  count  for  the  SPDE-based  OAs:  The  cost  of  the  SPDE 
computations  of  covariances  with  one  data  point  is  at  least  in  N2n  but  most 
likely  in  N 3  where  n  is  the  number  of  time-steps  to  reach  steady  state. 

Meaningful  comparisons  among  the  different  methods  require  comparable 
covariance  parameters.  Specifically,  for  our  SPDE-based  OA  examples  using 
Eqn.  15,  the  SPDE  parameter  (k)  is  chosen  such  that  the  correlation  function 
corresponding  to  the  stochastically  forced  Helmholtz  equation  best  fits  the 
analytical  correlation  function  used  by  our  standard  OA  scheme  and  by  our 
new  LSM  or  FMM-based  schemes,  see  Sect.  4  and  (Agarwal,  2009).  The 
results  of  these  methods  can  then  be  compared  to  each  other.  This  is  done 
next  in  Sect.  6.2  using  the  World  Ocean  Atlas-2005  data  in  the  Philippines 
Archipelago  region. 

6.  Applications  illustrating  the  novel  OA  methodologies 

Methodologies  for  OAs  in  complex  multiply-connected  coastal  regions 
were  derived  and  described  in  Sect.  4.  These  methodologies  are  based  on 
computing  optimal  sea  path  lengths  using  the  Level  Set  Method  (LSM)  and 
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the  Fast  Marching  Method  (FMM).  They  efficiently  incorporate  all  geometri¬ 
cal  constraints  (e.g.  there  is  no  direct  relationship  across  landforms)  but  also 
other  generalized  constraints  (see  Sect.  7).  They  are  utilized  next  to  map  tem¬ 
perature,  salinity  and  biological  (chlorophyll)  fields  using  a  2-staged  mapping 
scheme  in  the  following  regions:  Dabob  Bay  and  Philippines  Archipelago.  For 
other  regions,  we  refer  to  Agarwal  (2009). 

Section  6.1  evaluates  our  new  schemes  in  Dabob  Bay  and  shows  that 
they  are  more  effective  than  other  classic  distance  optimizing  algorithms 
such  as  the  Bresenham-based  line  algorithm  (Bresenham,  1965).  Section 
6.2  compares  the  different  methods  introduced  in  Sects.  4  and  5  for  OAs  in 
the  Philippines  Archipelago  region.  The  estimation  of  total  velocity  under 
geostrophic  balance  by  minimizing  unknown  inter-island  transports  is  also 
illustrated. 

6.1.  Objective  Analysis  in  Dabob  Bay 

Dabob  Bay  data  are  used  to  illustrate  the  effectiveness  of  the  FMM-based 
scheme  over  other  distance  optimizing  algorithms  like  Bresenham-based  line 
algorithm  (Bresenham,  1965).  Figure  2  shows  maps  of  temperature  and 
salinity  fields  obtained  using  the  spatially  irregular  data  in  the  region  and 
the  a.  Bresenham-based  line  algorithm,  b.  Fast  Marching  Method.  The 
limitation  of  Bresenham-based  line  algorithm  is  that  the  optimal  distance 
computed  using  this  method  is  discontinuous.  This  results  in  discontinuities 
in  the  covariance  and  also  in  the  resultant  field  maps  (Agarwal,  2009). 

The  temperature  and  salinity  field  maps  (Fig.  2)  were  obtained  using 
two-scale  OAs:  one  with  larger  length  scales  (L0  =  60,  Le  =  30)ls  and 
one  with  smaller  length  scales  (L0  =  30,  Le  =  15)me,  in  both  cases  using 
a  non-dimensional  observational  error  variance  (a2d  =  0.25).  Temperature 
and  salinity  data  have  higher  values  in  the  western  arm  (Fig.  2  (top)).  The 
eastern  arm  (Fig.  2  (Middle))  has  relatively  low  temperature  and  salinity. 
Effects  due  to  the  discontinuity  in  distance  obtained  from  Bresenham-based 
line  algorithm  are  clearly  evident  in  Fig.  2  (Middle).  Numerical  fronts  with 
high  temperature  and  salinity  gradients  exist  at  the  intersection  of  the  two 
arms.  Such  fronts  lead  to  numerical  problems  in  dynamical  simulations. 
The  geostrophic  velocity  obtained  using  these  field  maps  is  unrealistic  and 
has  high  magnitudes  along  these  fronts.  A  possible  remedy,  which  reduces 
the  discontinuity  effects,  is  to  smooth  the  distance  by  averaging  distances 
of  neighboring  points  (Haley,  personal  communication).  We  found  that  this 
averaging  technique  becomes  numerically  very  expensive.  In  addition,  the 
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intensity  of  erroneous  fronts  are  reduced  when  this  averaged  Bresenham- 
based  line  algorithm  is  used,  but  they  still  exist.  Finally,  when  our  new  FMM- 
based  scheme  is  used  to  compute  distances  and  to  compute  the  OAs,  results 
are  clearly  devoid  of  any  numerical  fronts  (Fig.  2  (bottom)).  The  FMM-based 
scheme  accurately  satisfies  the  coastline  constraints  and  is  computationally 
inexpensive  when  compared  to  the  Bresenham-based  line  algorithms. 

6.2.  Objective  Analysis  in  the  Philippines  Archipelago 

A  motivation  of  this  study  was  the  Philippines  Straits  Dynamics  Experi¬ 
ment  (PhilEx,  Lermusiaux  et  al.  (2011)).  In  such  a  complex  coastal  region, 
our  new  schemes  were  needed  to  map  the  very  irregular  datasets  available 
and  initialize  simulations.  Without  them,  major  problems  occurred:  neither 
dynamical  studies  nor  ocean  forecasts  could  be  initiated  from  standard  OA 
schemes.  To  illustrate  this,  different  OA  schemes  are  compared  next,  specih- 
cally:  our  new  OA  methods  based  on  the  FMM,  a  standard  OA  scheme  which 
ignores  islands  and  uses  the  direct  Euclidean  distance,  and  a  stochastically 
forced  PDE  scheme  (SPDE  specified  for  the  held). 

The  World  Ocean  Atlas-2005  (Locarnini  et  ah,  2006;  Antonov  et  ah,  2006) 
data  for  temperature  and  salinity  are  used.  WOA-05  data  are  data  mapped 
using  the  ‘Levitus  climatology’  scheme  (see  App.  A)  and  are  regularly  spaced. 
These  data  are  used  here  primarily  to  illustrate  and  discuss  the  comparison  of 
the  different  methodologies  with  a  classical  data  set.  Subsequently,  synoptic 
in  situ  data  sets  are  used  for  temperature,  salinity  and  biological  (chlorophyll) 
data. 

6.2.1.  Objective  Analysis  using  WOA-05  data:  Methods  comparison 

Hydrographic  field  maps.  We  compare  two-dimensional  horizontal  OA 
held  maps  of  the  WOA-05  data  (Figs.  3-8,  Top- Left)  computed  using  schemes 
presented  in  Sect.  4  and  5.  Figures  3,  4  and  5  show  the  temperature  held  maps 
at  the  depth  of  0m,  200m  and  1000m,  respectively.  Figures  6,  7  and  8  show 
the  salinity  held  maps  at  the  depth  of  0m,  200m  and  1000m,  respectively.  For 
the  two-scale  OA  schemes,  the  correlation  function  used  for  each  scale  is  given 
in  Eqn.  4.  The  parameters  are:  large  length  scales  (L0  =  540,  Le  =  180)ls, 
most  energetic  length  scales  (L0  =  180,  Le  =  60)me  and  observational  error 
variance  a ^  =  0.25.  For  the  SPDE  approach,  the  SPDE  parameter  k  is  set  to 
1/200  (this  is  a  best  ht  to  the  correlation  function  used  by  the  other  schemes) 
and  the  observational  error  to  a %  =  0.25. 
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The  OA  field  maps  from  all  methods  (Fig.  3  and  4)  indicate  that  the 
Philippines  Sea  and  the  region  near  Palawan  island  is  warmer  than  the  rest 
of  the  region  near  the  surface  (Om,  200m).  The  region  south  of  the  Sulu 
sea  around  the  Sulu  Archipelago  has  relatively  lower  temperature.  At  levels 
below  500m  (see  Fig.  5),  there  is  a  significant  difference  in  the  temperature  of 
the  Sulu  sea  (warm)  when  compared  to  the  rest  of  the  region  (cold)  (Gamo 
et  ah,  2007;  Gordon,  2009).  These  temperature  fields  show  that  direct  cor¬ 
relation  across  landforms  are  likely  weak.  Similar  observations  can  be  made 
for  Salinity.  Salinity  in  the  Sulu  Sea  and  South  China  Sea  (Fig.  6  and  7)  is 
lower  than  the  salinity  in  the  rest  of  the  region  near  the  surface  (0m,  200m). 
At  levels  below  500m,  the  salinity  in  the  Sulu  Sea  (Fig.  8)  is  significantly 
lower  than  in  the  rest  of  the  region.  These  salinity  fields  further  support  the 
hypothesis  that  direct  correlation  across  landforms  are  weak. 

The  field  maps  obtained  using  the  LSM  and  FMM  are  identical,  but  the 
FMM  has  a  significantly  lower  computational  cost.  While  the  LSM  constructs 
the  distance  estimate  by  iterating  towards  it,  the  FMM  is  based  on  the  direct 
construction  of  the  stationary  solution  (see  Sect.  4).  The  OA  fields  obtained 
using  LSM  and  FMM  are  very  close  because  the  FMM  exactly  constructs 
the  solution  of  the  discretized  Eikonal  equation  whereas  the  LSM  computes 
the  solution  within  a  desired  tolerance  limit.  Thus,  an  OA  based  on  FMM 
should  clearly  be  preferred,  as  it  is  more  accurate  and  less  expensive.  On  the 
other  hand,  the  SPDE  approach  leads  to  OAs  that  are  much  more  noisy  than 
those  obtained  using  the  FMM.  Since  it  is  also  more  expensive,  the  FMM 
scheme  is  superior. 

The  comparison  of  the  different  methods  for  the  temperature  and  salinity 
maps  at  1000m  is  shown  in  Figs.  5  and  8,  respectively.  The  methods  based 
on  FMM  (Figs.  3-8  (Bottom-Left))  and  SPDE  (Figs.  3-8  (Bottom- Right)) 
clearly  satisfy  the  coastline  constraints.  The  data  in  the  Sulu  Sea,  which  has 
high  temperature  and  low  salinity  compared  to  the  remaining  region,  does 
not  influence  the  field  outside  the  Sulu  Sea  since  the  two  regions  are  not  con¬ 
nected  by  water  at  1000m  (assuming  only  2D  horizontal  correlations).  On 
the  other  hand,  the  standard  OA  (Figs.  3-8  (Top-Right))  does  not  satisfy 
the  coastline  constraints.  Thus  the  data  outside  the  Sulu  Sea,  where  the 
temperature  is  low  and  salinity  is  high,  is  correlated  to  the  field  inside  the 
Sulu  Sea.  This  is  undesirable  since  the  direct  relationship  across  landforms 
is  at  best  very  weak.  This  leads  to  spurious  high  temperature  and  salin¬ 
ity  gradients  in  the  Sulu  Sea,  which  creates  large  spurious  geostrophic  flow 
shear.  Differences  between  temperature  field  maps  and  salinity  field  maps 
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obtained  using  the  FMM  and  using  other  OA  methods  at  1000m  are  shown 
in  Fig.  9.  The  differences  between  the  held  maps  obtained  using  the  FMM 
and  standard  OA  are  large  because  the  standard  OA  does  not  incorporate 
the  coastline  constraints.  There  are  small  differences  between  held  maps  ob¬ 
tained  using  the  FMM  and  SPDE  approaches  because:  i)  the  SPDE  scheme 
is  more  sensitive  to  truncation  errors,  and  ii),  the  analytical  correlation  func¬ 
tion  corresponding  to  the  Helmholtz  equation  (used  in  the  SPDE  approach) 
is  slightly  different  from  the  analytical  correlation  function  in  the  FMM. 

The  SPDE  approach  satishes  the  coastline  constraints,  but  the  discretiza¬ 
tion  errors  in  the  SPDE  can  be  significant  and  this  results  in  noisy  spa¬ 
tial  variations  in  the  OA  maps,  even  though  this  noise  is  not  present  in 
the  monthly  hydrographic  data.  This  noise  then  also  negatively  affects  the 
geostrophic  flow  shear,  and  additional  smoothing  (post-processing)  is  often 
needed  to  filter  SPDE-based  OA  fields.  Such  post-processing  is  not  required 
for  our  FMM-based  scheme.  As  mentioned  in  Sect.  5,  an  SPDE  approach 
can  be  implemented  by  specifying  the  SPDE  for  the  held  (as  shown  in  Figs  3- 
8  (Bottom-Right))  or  by  specifying  it  directly  for  the  covariance  (Logutov, 
personal  communication).  The  later  scheme  is  a  bit  cheaper  than  the  former 
but  it  is  a  rough  approximation  and  it  further  increases  the  undesired  noise 
of  the  held  maps.  Finally,  the  computational  time  required  by  the  SPDE 
approach  was  confirmed  higher  than  that  of  the  FMM,  in  accord  with  the 
operation  counts  of  Sect.  5.  Thus,  the  FMM  appears  to  be  the  best  among 
all  the  methods  of  Sects.  4  and  5.  This  was  confirmed  in  many  other  regions 
and  the  FMM  scheme  is  thus  used  to  map  the  spatially  irregular  synoptic 
data  in  the  sections  that  follow. 

Velocity  field,  maps.  We  now  illustrate  the  estimation  of  total  velocity 
under  geostrophic  balance  in  the  region  using  the  above  OA  held  maps  of 
hydrographic  WOA05  data.  The  algorithm  for  optimizing  inter-island  trans¬ 
ports  (App.  C)  is  utilized  to  compute  a  smooth  total  how  held  estimate  under 
the  constraint  of  geostrophic  shear  balance.  Weight  functions  based  on  the 
minimum  vertical  area  along  each  pair  of  islands  are  computed  and  used  in 
the  algorithm.  The  estimation  of  the  minimum  vertical  area  has  been  carried 
out  using  the  FMM  by  specifying  the  scalar  speed  function  in  the  Eikonal 
Eqn.  11  as  F(x,y)  =  1/H(x,y),  where  H  is  the  ocean  depth.  The  tempera¬ 
ture  and  salinity  maps  are  those  of  our  FMM-based  OA  scheme  (Figs.  3-8 
(Bottom-Left))  and  of  the  SPDE  approach  (Figs.  3-8  (Bottom-Right)),  with 
the  Helmholtz  equation  employed  for  the  held.  The  streamfunction  and  ve¬ 
locity  fields  (at  depths  0m,  100m)  are  shown  in  Fig.  10.  The  estimates 
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based  on  our  FMM-based  hydrographic  OAs  (Fig.  10  (Left))  are  in  overall 
good  agreement  with  those  obtained  using  maps  based  on  the  stochastically 
forced  Helmholtz  equation  (Fig.  10  (Right)).  However,  the  SPDE-based  ve¬ 
locity  fields  are  noisier,  reflecting  the  spurious  noise  in  the  hydrographic  OAs. 
On  average,  these  monthly  mean  flow  estimates  suggest  larger  density-driven 
velocities  in  the  Mindoro  strait,  near  the  Mindanao  Island  and  in  the  Bal¬ 
abac  strait.  The  maximum  absolute  velocity,  reaches  80  cm/s  in  the  Balabac 
strait  at  the  surface.  At  lower  depths,  velocities  remain  high  in  the  Mindoro 
strait  and  near  the  Mindanao  Island. 

Weight  functions  based  on  the  minimum  inter-island  distance,  which  can 
be  obtained  using  the  FMM  by  specifying  the  scalar  speed  function  in  the 
Eikonal  Eqn.  11  as  1  for  sea  points  and  0  for  land  points,  were  also  used. 
The  velocity  fields  obtained  using  these  weight  functions  had  much  larger 
magnitudes,  particularly  in  the  Balabac  strait  (Agarwal,  2009)  where  the 
maximum  absolute  velocity  was  141  cm/s.  Such  high  velocity  magnitudes 
are  very  unlikely.  These  results  show  that  weight  functions  based  on  the 
minimum  vertical  area  (which  is  logical  for  transport  estimates)  are  the  most 
adequate. 

6.2.2.  Objective  Analysis  of  synoptic  data  for  the  Summer  2007 

The  data  used  in  this  example  is  collected  from  the  Melville  exploratory 
cruise,  sgl22  and  sgl26  gliders  for  the  June-July’07  period  (Gordon  (2009) 
and  Craig  Lee,  personal  communication).  The  data  coverage  is  shown  in 
Fig.  11  (Top).  A  portion  of  the  Philippines  Archipelago  near  islands  is 
sampled  and  OA  maps  are  computed  in  that  region.  The  scales  used  are: 
large  length  scales  (L0  =  1080, Le  =  360)ls  and  most  energetic  length  scales 
( Lq  =  270,  Le  =  90)me-  The  observational  error  is  set  to  =  0.20.  The  hy¬ 
drographic  held  maps  obtained  using  our  FMM-based  OA  scheme  are  shown 
in  Figs.  12  (Top)  and  13  (Top),  respectively  at  depths  of  Orn  and  200m.  Once 
again,  these  maps  clearly  indicate  that  the  coastline  constraints  are  appro¬ 
priately  satisfied.  At  depth  of  0m,  the  warmer  regions  to  the  west  of  Luzon 
island  remain  uncorrelated  with  the  Pacific  waters  east  of  Luzon.  The  warm 
Sibuyan  and  Visayan  Seas  can  be  distinguished  from  the  relatively  cold  Bo¬ 
hol  Sea.  At  450m  and  1000m,  the  data  in  the  warm  Sulu  Sea  and  Bohol 
Sea  do  not  impact  the  other  regions:  there  is  no  direct  relationship  across 
landforms.  Similar  observations  are  made  for  the  salinity  (e.g.  at  0m,  the  low 
salinities  west  of  Luzon  island  do  not  affect  Pacific  waters  east  of  Luzon). 
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6.2.3.  Objective  Analysis  for  early  Winter  2008 

The  data  used  in  this  example  is  obtained  from  the  joint  Melville  cruise 
for  the  Nov’07-Jan’08  period.  The  data  locations  are  shown  in  Fig.  11  (Bot¬ 
tom).  The  OA  parameters  are:  large  length  scales  (L0  =  1080,  Le  =  360)ls, 
most  energetic  length  scales  (L0  =  270,  Le  =  90)me  and  observational  error 
variance  (a)  =  0.20).  The  hydrographic  field  maps  obtained  using  the  FMM- 
based  scheme  are  shown  in  Figs.  12  (Bottom)  and  13  (Bottom),  respectively, 
at  Orn  and  200m  depth.  At  the  surface,  the  warm/fresher  region  west  of 
Luzon  is  uncorrelated  with  the  region  east  of  Luzon.  At  depths  of  450m  and 
1000m  (not  shown),  the  warm  Bohol  Sea  is  enclosed  and  at  these  depths,  it 
does  not  affect  other  regions  either. 

Comparing  the  Winter  2008  from  the  Summer  20007,  the  largest  differ¬ 
ences  in  temperature  and  salinity  are  near  the  ocean  surface  (deeper  than 
200  m  depth,  fields  are  much  closer).  For  example,  the  Orn  temperature  near 
Luzon  is  significantly  lower  in  Winter  2008  than  in  Summer  2007.  However, 
in  the  Sulu  Sea,  the  temperature  is  nearly  the  same  for  both  Summer  2007 
and  Winter  2008. 

6.2.f.  Objective  Analysis  for  biological  fields  (chlorophyll) 

Of  course,  our  new  FMM-based  scheme  is  not  limited  to  physical  fields. 
Its  application  to  biological  fields  is  illustrated  here  using  the  Exploratory 
cruise  Summer  2007  data  (Gordon,  2009).  Our  biological  OA  field  maps 
(for  chlorophyll,  nitrate  and  ammonium)  were  utilized  to  initialize  coupled 
physics-biology  modeling  studies  (Burton,  2009;  Lermusiaux  et  al.,  2011). 
The  mapping  of  chlorophyll  profiles  is  illustrated  here.  The  profiles  were 
estimated  from  satellite  images  and  a  region-by-region  feature  model  (Ler¬ 
musiaux  et  al.,  2011).  The  biological  OA  parameters  were  fit  to:  large  length 
scales  (Z/0  =  1080,  Le  =  360)ls,  nrost  energetic  length  scales  (L0  =  270, 
Le  =  90)me  and  observational  error  variance  (crj  =  0.20).  The  resulting 
chlorophyll  maps  are  illustrated  in  Fig.  14  at  depths  of  Orn,  10m,  50m,  160m. 

The  concentration  of  biological  fields  like  chlorophyll,  phytoplankton  and 
zooplankton  is  substantial  near  the  surface  clue  to  the  presence  of  sunlight. 
The  chlorophyll  concentration  is  maximum  near  islands  often  driven  by  winds 
or  bathymetric  upwelling.  Away  from  islands,  it  tends  to  be  more  uniform, 
around  a  mean  value.  At  0m  and  10m  depths,  the  maximum  chlorophyll 
concentration  is  observed  south  of  the  Visayan  sea  and  in  the  Bohol  Sea. 
At  50m  depth,  chlorophyll  concentrations  remain  significant  there,  but  the 
largest  chlorophyll  concentrations  are  observed  north  of  Palawan  island.  Bi- 
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ological  concentrations  at  lower  depths  decrease  rapidly. 


7.  Computational  Analysis 

The  computational  properties  of  our  methods  for  mapping  irregular  data 
in  complex  geometries  are  now  described  and  studied.  First,  the  computa¬ 
tional  costs  are  compared  in  Sect.  7.1.  Then,  schemes  to  resolve  issues  specific 
to  complex  multiply-connected  coastal  regions  such  as  the  need  for  accurate 
distance  estimates  (Sect.  7.2)  and  the  need  for  positive-definite  covariance 
matrices  (Sect.  7.3)  are  discussed.  These  schemes  are  important  because  if 
the  covariance  matrix  becomes  negative,  divergence  problems  occur  in  the 
Kalman  updates  (Brown  and  Hwang,  1997). 

To  motivate  the  computational  studies,  recall  that  we  generate  the  covari¬ 
ance  matrices  using  analytical  correlation  functions  defined  based  on  the  Eu¬ 
clidean  distance.  These  correlation  functions  are  termed  “positive  definite” 
if  they  generate  positive  definite  covariance  matrix  in  a  simply-connected 
convex  domain.  It  has  been  well  established  using  the  Wiener-Khinchin  and 
Bochner’s  theorems  that  if  the  Fourier  transform  (or  the  spectral  density)  of 
a  correlation  function  is  non-negative  for  all  frequencies  then  the  correlation 
function  is  positive  definite  (Yaglom,  1987;  Papoulis,  1991;  Yaglom,  2004; 
Dolloff  et  al.,  2006).  However,  we  found  that  for  coastal  regions,  covari¬ 
ance  matrices  generated  from  “positive  definite  correlation  functions”  may 
not  be  positive  definite  due  to:  a.  numerical  errors  in  the  computation  of 
the  shortest  path  lengths  using  our  FMM/LSM  schemes,  or  b.  the  presence 
of  landfornrs  which  lead  to  multiply-connected  or  non-convex  domains,  in¬ 
validating  assumptions  in  the  Wiener-Khinchin  and  Bochner  theorems  (see 
(Agarwal,  2009)  for  proof).  This  can  lead  to  divergence  problems  in  the 
mapping.  Such  problems  are  illustrated  using  the  WOA-05  data  (Spliced 
February  and  Winter  Climatology)  shown  in  Fig.  15  (Top-Left).  To  simplify, 
we  consider  single-scale  OAs  (all  previous  example  were  two-scale  OAs). 
The  held  maps  obtained  using  our  FMM-based  scheme  with  length  scales 
(L0  —  540,  Le  =  180)  and  length  scales  (L0  =  1080,  Le  =  360)  are  shown 
in  Fig.  15.  Fields  obtained  using  the  larger  scales  (Fig.  15  (Bottom-Left)) 
clearly  show  divergence  problems  near  the  Palawan  island.  These  problems 
are  not  encountered  when  the  smaller  length  scales  are  used  and  are  much 
smaller  when  a  higher-order  FMM  scheme  is  used  (Fig.  15  (Top-Right)). 
Questions  which  motivate  our  research  in  Sects.  7.2  and  7.3  are  thus:  i) 
What  are  the  computational  errors  in  the  shortest  sea-path  distances  com- 
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puted  using  the  FMM/LSM  and  how  can  they  be  reduced?,  and  ii)  What 
are  the  computational  issues  including  non-positive  definite  covariances  that 
arise  in  a  multiply-connected  coastal  domain  and  how  can  they  be  reme¬ 
died?  A  higher-order  FMM  than  the  first-order  one  (Sect.  4.2)  is  discussed 
in  Sect.  7.2.  Higher-order  FMMs  significantly  reduce  errors  in  the  distance 
estimates,  i.e.  the  difference  between  the  numerically  computed  and  true 
distance,  which  limits  divergence  problems  in  the  mapping.  However,  even 
if  exact  distances  are  used,  when  curved  boundaries  or  islands  are  present 
in  the  domain,  negative  covariances  can  still  occur.  Methods  to  solve  these 
issues  are  derived  in  Sect.  7.3. 

7.1.  Comparison  of  Computational  Costs 

For  all  OA  schemes,  we  sequentially  process  observations  (see  Parrish 
and  Cohn  (1985);  Clio  et  al.  (1996);  Lermusiaux  (1997);  Agarwal  (2009)). 
Such  sequential  processing  drastically  reduces  computational  costs  and  also 
allows  estimating  the  impact  of  individual  data.  Since  data  are  processed 
sequentially,  the  costs  for  the  OA  schemes  are  compared  by  considering  a 
single  scalar  data  point.  The  main  cost  is  then  the  computation  of  the 
covariance  of  that  data  point  with  all  other  grid  points  in  the  domain.  For 
the  FMM,  LSM  and  Dijkstra’s  schemes,  the  operation  count  to  do  this  is 
driven  by  the  computation  of  the  shortest  distances  from  that  data  point 
with  all  other  points.  For  the  SPDE  scheme,  it  depends  on  the  diffusion 
equation  used  and  on  the  iterations  to  reach  state-state.  For  a  2-D  domain 
with  N  points  in  each  direction,  these  operation  counts  are  given  in  Table  2. 


Method 

Operation  Count 

Level  Set  Method 

0(N3) 

Fast  Marching  Method 

0(N2logN) 

SPDE  Method 

0  (N2n) 

Dijkstra’s  Method 

0(N3) 

Table  2:  Operation  counts  for  computing  the  covariances  among  one  data 
point  and  each  of  the  N2  model  grid  points,  as  obtained  using  the  LSM, 
FMM,  SPDE  (n  iterations)  and  Dijkstra’s  schemes. 


There  are  a  total  of  N2  grid  points  at  each  level  and  the  operation  count 
for  LSM  is  obtained  from  an  optimistic  guess  that  LSM  will  take  roughly 
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N  steps  to  converge.  In  reality,  the  iterations  can  take  much  longer  to  con¬ 
verge,  and  the  LSM  is  thus  not  efficient  to  compute  these  distances.  On 
the  other  hand,  FMM  is  an  efficient  technique  which  requires  a  fast  method 
to  locate  the  smallest  value  grid  point  in  the  narrow  band.  The  Min-Heap 
data  structure  with  backpointers  (Sedgewick,  1988)  is  employed  here  to  effi¬ 
ciently  locate  the  grid  point  with  the  minimum  value.  The  total  work  done 
in  the  DownHeap  and  UpHeap  operations,  which  ensure  that  the  updated 
quantities  do  not  violate  the  heap  properties,  is  0(log  N).  Thus,  for  a  2D  do¬ 
main  with  N  grid  points  in  each  direction,  the  FMM  has  an  operation  count 
of  N2logN,  which  is  a  significant  improvement  over  the  LSM.  An  efficient 
SPDE  scheme  requires  at  least  an  order  of  N2n  where  n  is  the  number  of  it¬ 
erations  to  reach  steady  state.  We  have  observed  that  the  SPDE  approach  is 
at  least  15%  more  expensive  computationally  than  the  FMM  scheme.  Thus, 
the  FMM-based  scheme  is  computationally  the  most  efficient. 


1.2.  Higher  order  Fast  Marching  Method 

In  a  domain  with  no  islands  or  landforms,  the  shortest  path  length  ob¬ 
tained  using  the  FMM/LSM  should  be  equal  to  the  Euclidean  distance.  But 
the  FMM/LSM  have  discretization  errors  which  lead  to  inaccurate  length 
estimates.  The  Weiner  Khinchin  and  Bochner  theorems  are  valid  for  covari¬ 
ances  computed  using  the  Euclidean  distance  in  a  simply-connected  convex 
domain.  So,  if  the  domain  is  simply-connected  convex,  the  covariance  ma¬ 
trix  can  only  become  negative  definite  due  to  the  inaccurate  length  estimates. 
This  may  lead  to  divergence  problems  in  the  resultant  held  maps.  In  this 
Section,  the  goal  is  to  estimate  and  reduce  the  computational  errors  in  the 
shortest  path  lengths.  We  first  introduce  the  higher  order  FMM  which  re¬ 
duces  these  errors. 

The  FMM  scheme  presented  in  Sect.  4.2  is  first  order,  since  the  first 
order  discretization  form  (Eqn.  14)  of  the  Eikonal  Eqn.  11  is  used.  A  dif¬ 
ferent  implementation  of  FMM  with  higher  accuracy  (Sethian,  1999a, b)  is 
discussed  here.  It  employs  the  second  order  backward  approximation  to  the 
first  derivative  Tx  is  given  by: 
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and  the  second  order  forward  approximation  to  the  first  derivative  Tx  given 
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by: 
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Here  D  x  and  D+x  are  the  first  order  forward  and  backward  approxima¬ 
tions  for  the  first  derivative,  respectively  (Eqn.  10),  D~x~x  =  D~XD~X and 

JJ+X+X  —  JJ+x  ]J+x 

Consider  the  switch  functions  defined  by: 


switch,  x 


switchfx 

LJ 


1  if  Tj_2j  and  T,_ ij  are  known  (‘Alive’) 
and  l (  2 ./  A  ^i—ij 
0  otherwise 

1  if  Ti+2J  and  Ti+lj  are  known  (‘Alive’) 
and  I  f  ~  '2.j  A  "A+ 1  .j 
0  otherwise 


(20) 


Similar  functions  are  defined  in  the  y-direction.  The  higher  accuracy  scheme 
attempts  to  use  a  second  order  approximation  for  the  derivative  whenever 
the  points  are  tagged  as  ‘alive’  (the  points  inside  the  band  where  the  value 
of  the  arrival  time  function  is  frozen:  see  Sect.  4.2)  but  reverts  to  the  first 
order  scheme  otherwise. 

The  modified  discretization  equation  for  the  higher  accuracy  FMM  is  thus 
given  by: 


f  rnax{[D~xT  +  switch^ ^D^X~XT], 

-[D±XT  -  switch^ 4r Dfx+xT } ,  0)2 

+ 

mnifiD./T  +  sw/trhJ^D,^1  yT\, 

-H’t  -  switch?? % fl.?+'r],0)2 

(21) 


It  should  be  noted  that  the  above  scheme  is  not  necessarily  a  second  order 
scheme.  Its  accuracy  depends  on  how  often  the  switches  evaluate  to  zero  and 
how  the  number  of  points  where  the  first  order  method  is  applied  changes 
as  the  mesh  is  refined.  When  the  number  of  points  where  the  first  order 
method  is  applied  is  relatively  small  (occurs  only  near  the  coastlines),  the 
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error  is  reduced  considerably  by  using  the  higher  accuracy  FMM  (Agarwal, 
2009).  It  should  also  be  noted  that  a  third  or  higher-order  approximations 
for  the  derivative  Tx  can  similarly  be  used  to  construct  more  accurate  FMM 
schemes,  but  this  increases  the  computational  cost.  We  also  found  that  the 
relative  error  in  the  distances  computed  by  the  FMM  is  higher  near  the  data 
point  and  it  decays  as  the  distance  increases.  To  keep  the  computational 
cost  low  and  a  uniform  relative  error,  we  can  thus  use  higher  accuracy  FMM 
near  the  data  points  and  then  progressively  shift  to  the  lower  order  schemes 
as  the  distance  increases. 

The  results  of  using  higher  order  FMMs  to  minimize  errors  in  the  estima¬ 
tion  of  the  shortest  path  length  are  illustrated  on  Fig.  15  (Bottom-Right). 
They  clearly  show  that  the  above  higher  order  FMM  has  attenuated  the  di¬ 
vergence  issues  compared  to  the  first  order  FMM.  The  divergence  issues  do 
not  vanish  completely  because  some  discretization  errors  still  occur  but  also 
because  of  the  presence  of  landforms.  To  deal  with  the  latter,  which  are  due 
to  the  multiply-connected  coastal  domains,  we  further  improve  schemes  next 
in  Sect.  7.3. 

7.3.  Positive  Definite  covariance  matrix  for  complex  multiply- connected  coastal 
regions 

Apart  from  the  inaccurate  shortest  path  length,  the  covariance  matrix 
may  also  become  negative  due  to  the  presence  of  islands  and  coastlines. 
This  is  because  the  presence  of  islands  and  archipelagos  stretches  the  direct 
Euclidean  path,  which  can  render  the  covariance  matrix  negative. 

For  example,  consider  the  idealized  multiply-connected  domain  with  an 
island,  shown  on  Fig.  16.  This  domain  has  12  grid  points  marked  as  ocean 
points  and  4  grid  points  marked  as  land  points.  The  length  of  the  shortest 
sea  path  is  computed  exactly  to  form  the  covariance  matrix  and  so  remove 
all  discretization  errors  of  the  FMM/LSM.  To  do  so,  the  positive-definite 

correlation  function  Cor(r)  =  exp  —fjjz  with  L=2  is  used.  We  find  that 
the  covariance  matrix  is  not  positive  definite.  The  maximum  eigenvalue  for 
the  covariance  matrix  is  6.3345  while  the  minimum  is  -0.0504.  This  idealized 
example  clearly  reveals  that  classic  Euclidean-based  covariance  matrices  for 
a  complex  multiply-connected  region  may  not  necessarily  be  positive  defi¬ 
nite.  This  is  because  the  conditions  of  the  Wiener- Khinchin  and  Bochner’s 
theorems  are  not  satisfied. 

One  could  consider  changing  the  coordinate  system,  for  example  curvilin- 
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ear  coordinates.  For  adequate  coordinate  choices,  in  the  transformed  space, 
the  domain  can  then  be  simply  connected  and  convex.  However,  the  issue 
then  is  that  the  real  distances  among  grid  points  become  position  dependent 
which  violates  another  assumption  of  the  Wiener  -Khinchin  and  Bochner’s 
theorem,  see  (Agarwal,  2009)  for  examples  and  more  discussions. 

Hence,  other  schemes  have  to  be  used  to  alleviate  the  divergence  prob¬ 
lems  (Fig.  17  (Top-Left))  due  to  the  non-positive  definite  covariance  matrix. 
They  include: 

a.  Discarding  the  problematic  data:  Discarding  the  data  that  lead  to 
negative  values  of  HjCor(x,  x)j_1H-r  would  solve  the  issue  and  eliminate  di¬ 
vergences  in  the  resultant  OA.  However,  this  method  is  a  not  adequate  since 
the  information  in  the  data  is  discarded  entirely.  The  held  map  obtained  by 
discarding  the  problematic  data  is  shown  in  Fig.  17  (Top-Right).  Clearly, 
the  divergence  problems  are  removed  but  loosing  data  is  not  acceptable. 

b.  Introducing  process  noise:  Adding  a  small  process  noise  to  the  di¬ 
agonal  elements  of  the  covariance  matrix  would  help  (Brown  and  Hwang, 
1997),  but  it  will  lead  to  a  degree  of  sub-optimality:  the  noise  affects  all  of 
the  problematic  data.  However,  it  is  often  a  more  acceptable  scheme  than 
discarding  the  data.  We  indeed  find  that  introducing  the  process  noise  leads 
to  less  divergence  problems,  as  shown  in  our  example,  see  Fig.  17  (Bottom- 
Left). 

c.  Dominant  Singular  Value  Decomposition  (SVD)  of  a-priori  co- 
variance:  To  construct  the  OA  held  maps,  the  full  covariance  matrix  is 
not  required.  In  fact,  the  full  covariance  matrix  (Cor(x,x))  is  expensive  to 
compute  and  store,  and  it  is  therefore  rarely  computed.  The  necessary  re¬ 
quirement  for  held  maps  is  the  covariance  matrix  among  the  grid  and  data 
points,  i.e.  Cor(x,X).  The  divergence  problems  can  be  removed  by  hrst 
obtaining  the  singular  value  decomposition  (SVD)  of  Cor(x,X)  and  then  re¬ 
taining  only  the  dominant  singular  values  and  setting  the  smaller  singular 
values  (e.g.  less  than  1  percent  of  the  maximum  singular  value)  to  zero.  This 
SVD  procedure  renders  the  covariance  matrix  non-negative  definite,  which 
was  verihed  in  multiple  examples  where  a  simulated  map  was  used  for  a  true 
ocean.  Based  on  these  results  and  on  minimum  error  variance  arguments, 
the  dominant  SVD  method  is  the  most  acceptable  one  because  it  loses  the 
least  information  contained  in  the  data.  Our  example  is  shown  on  Fig.  17 
(Bottom- Right).  We  hnd  that  the  held  maps  obtained  using  this  dominant 
SVD  of  the  a-priori  covariance  is  free  from  divergence  problems.  They  are 
also  similar  to,  but  further  improve,  the  helds  obtained  by  introducing  the 
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process  noise. 


8.  Summary  and  Conclusions 

New  methodologies  for  the  efficient  mapping  and  dynamical  inference 
of  ocean  fields  from  irregular  data  in  complex  multiply-connected  domains 
were  derived  and  utilized,  and  computational  properties  of  these  mapping 
schemes  were  studied.  These  new  OA  methods,  which  satisfy  the  coastline 
and  bathymetry  constraints  (e.g.  there  is  no  direct  relationship  across  land- 
forms),  are  based  on  estimating  the  length  of  the  optimal  sea  path  using 
either  the  Level  Set  Method  (LSM)  or  the  Fast  Marching  Method  (FMM). 
The  optimal  sea  path  was  geometrically  defined:  i.e.  for  purely  horizontal 
OAs,  it  is  the  shortest  sea  distance  in  2D,  and  for  3D  OAs,  it  is  the  shortest 
sea  distance  in  3D,  weighting  the  vertical  or  diapycnal  distances  more  than 
the  horizontal  ones.  Numerical  schemes  were  derived  and  implemented,  and 
their  operation  counts  compared.  Their  properties  and  results  were  studied 
in  complex  domains,  the  Philippines  Archipelago  and  Dabob  Bay,  in  re¬ 
alistic  situations.  Both  climatological  and  synoptic  datasets  were  employed 
and  estimates  of  temperature,  salinity  and  biological  (chlorophyll)  fields  were 
computed  and  discussed.  We  found  that  without  these  new  OA  methods,  nei¬ 
ther  meaningful  dynamical  studies  nor  meaningful  ocean  simulations  could 
be  initiated. 

Results  were  compared  with  those  of  a  standard  OA  scheme  (using  across- 
landforms  Euclidean  distance  in  the  analytical  correlation  function),  of  OA 
schemes  based  on  other  distance  estimation  methods  and  of  OA  schemes 
based  on  the  use  of  stochastically  forced  PDEs  (SPDEs).  We  showed  that  the 
FMM-based  scheme  is  computationally  cheaper  than  the  LSM-based  scheme 
and  diffusion-based  SPDE  approach.  We  found  that  the  held  maps  obtained 
using  our  FMM-based  schemes  were  more  robust  than  those  obtained  using 
SPDE  schemes:  fields  did  not  require  postprocessing  (smoothing),  i.e.  they 
were  devoid  of  any  spurious  gradients.  Such  spurious  gradients  in  hydro- 
graphic  maps  lead  to  unrealistic  geostrophic  hows.  The  FMM  and  LSM 
were  the  most  appropriate  for  estimating  the  optimal  sea  distances  among 
other  distance  estimation  schemes  such  as  Dijkstra’s  optimization  algorithm 
and  the  classic  Bresenham-based  line  algorithm.  The  optimal  distance  com¬ 
puted  using  Dijkstra’s  algorithm  is  computationally  expensive  and  inaccu¬ 
rate.  Apart  from  being  computationally  expensive,  the  optimal  distance 
computed  using  the  Bresenham  line  algorithm  is  discontinuous.  This  results 
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in  the  formation  of  numerical  fronts  with  high  held  gradients.  Such  erroneous 
gradients  do  not  occur  when  our  FMM-based  scheme  is  utilized. 

Mathematical  and  computational  properties  of  the  new  OA  schemes  were 
studied.  The  sequential  processing  of  observations  reduces  the  computational 
cost  and  also  helps  in  understanding  the  impact  of  individual  data.  We  found 
that  the  use  of  higher  order  FMMs  increased  the  accuracy  of  the  estimates 
of  the  length  of  shortest  sea  paths.  The  most  efficient  FMM  schemes  de¬ 
rived  employed  a  variable  order  discretization,  the  order  decaying  as  the 
distance  between  the  data  and  model  points  increases.  Accurate  FMM  dis¬ 
tance  estimates  eliminate  one  of  the  sources  of  negative  covariance  matrices. 
The  other  source  is  simply  the  presence  of  islands  or  of  other  non-convex 
landforms.  This  is  because  the  Wiener- Khinchin  and  Bochner  theorems  are 
valid  only  for  correlation  functions  based  on  the  Euclidean  distance  in  convex 
simply-connected  domains.  Several  approaches  to  overcome  this  issue  were 
discussed.  These  include  discarding  problematic  data  points,  introducing 
process  noise,  and  reducing  the  covariance  matrix  by  applying  the  dominant 
singular  value  decomposition  (SVD).  Among  these,  we  argued  and  showed 
that  the  latter  use  of  the  SVD  to  reduce  the  covariance  matrix  is  the  best 
solution. 

We  have  also  employed  a  FMM-based  method  to  estimate  the  total  ve¬ 
locity  under  geostrophic  balance  in  complex  multiply-connected  domains. 
The  FMM  is  used  to  compute  the  minimum  vertical  area  between  all  pairs 
of  islands.  Theses  minimum  areas  are  required  to  compute  the  transport 
streamfunction  held  that  optimizes  the  inter-island  transports  and  produces 
a  smooth  velocity  held.  The  result  is  a  mass-conserving  geostrophic  how 
in  balance  with  the  hydrographic  OA  maps  and  with  optimized  inter-island 
transports.  This  method  and  the  minimum  vertical  area  estimates  were  nec¬ 
essary  to  obtain  realistic  velocity  estimates  in  our  Philippines  Archipelago 
examples. 

As  part  of  our  ongoing  work,  we  have  started  to  incorporate  additional 
geometrical  and  non-homogeneous  dynamical  effects  to  our  FMM-based  OA 
scheme.  An  approach  we  have  followed  is  to  modify  the  scalar  speed  func¬ 
tion  in  the  Eikonal  equation  as  a  function  of  these  geometrical  properties 
and  heterogeneous  dynamics.  In  particular,  we  have  utilized  a  bathymetry- 
dependent  scalar  speed  function  to  include  bathymetric  effects  at  lower  depth 
levels.  To  include  heterogeneous  scales  due  to  the  existence  of  fronts,  we  can 
hrst  create  an  expected  length  scale  held  that  is  a  function  of  space  and 
direction,  possibly  using  raw  data  only  (Agarwal,  2009)  or  a  feature  model 
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(Gangopadhyay  and  Robinson,  2002).  We  can  then  compute  the  optimal 
sea  path  as  before,  but  select  the  correlation  scales  as  the  smallest  scales 
found  along  that  path.  For  example,  if  the  optimal  path  crosses  a  front,  the 
length  scale  in  the  across  direction  would  then  be  the  minimum  cross-frontal 
scale.  Analogous  modification  of  the  scalar  speed  function  or  the  length  scale 
can  be  used  to  incorporate  other  dynamical  effects  (e.g.  conservation  of  po¬ 
tential  vorticity).  In  the  future,  the  ideas  of  optimal  path  length  and  our 
FMM/LSM-based  scheme  can  be  used  to  extend  to  complex  coastal  regions 
our  methodology  for  3D  multivariate  and  multi-scale  spatial  mapping  of  geo¬ 
physical  fields  of  their  dominant  errors  (Lermusiaux  et  ah,  2000;  Lermusiaux, 
2002).  Such  schemes  would  be  needed  for  ensemble  initializations. 

We  expect  a  wide  range  of  applications  for  our  new  FMM-based  mapping 
schemes.  Already  when  mapping  relatively  simple  coastal  domains,  the  con¬ 
straints  of  landforms  are  not  often  accounted  for.  Constraints  due  to  bathy¬ 
metric  features  should  also  be  respected,  even  in  deep  ocean  regions,  from 
the  simpler  basins,  plateaus  and  troughs  to  the  more  complex  sills,  ridges, 
seamounts  and  trenches.  Initial  gridded  conditions  computed  by  the  present 
FMM  methods  have  already  enabled  our  simulations  in  varied  regions,  includ¬ 
ing  the  Taiwan  region,  New  England  shelf,  Dabob  Bay  and  Monterey  Bay  (Xu 
et  al.,  2008;  Lermusiaux  et  ah,  2010;  Haley  and  Lermusiaux,  2010).  Our  new 
methods  would  also  improve  the  widely-used  gridded  ocean  databases  such 
as  the  World  Ocean  Atlas  (WOA)  since  such  oceanic  maps  were  computed 
without  explicitly  accounting  for  coastline  and  bathymetry  constraints. 
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A.  Objective  Analysis  Schemes  of  the  ‘Levitus  Climatology’ 

The  OA  schemes  used  to  map  the  ‘Levitus  Climatology’  (Levitus,  1982; 
Locarnini  et  ah,  2006;  Antonov  et  ah,  2006;  Garcia  et  ah,  2006a, b)  have  their 
origins  in  the  work  of  Cressman  (1959)  and  Barnes  (1964).  The  approach  is 
based  on  adding  “corrections”,  which  are  computed  as  a  distance-weighted 
mean  of  all  data  point  difference  values,  to  the  first-guess  field.  Initially,  to 
reduce  the  computational  time,  the  World  Ocean  Atlas  1994  (WOA94)  used 
the  Barnes  (1973)  scheme  which  requires  only  a  single  “correction”  to  the 
first-guess  field  at  each  grid  point  in  comparison  to  the  successive  correction 
method  of  Cressman  (1959)  and  Barnes  (1964).  The  most  recent  WOA98, 
WOA01  and  WOA05  maps  were  completed  employing  a  three-pass  “correc¬ 
tion”  scheme,  using  the  multi-pass  analysis  of  Barnes  (1994).  The  inputs  to 
this  global  analysis  scheme  are  differences  among  a  first-guess  field  and  the 
one-degree  square  means  of  the  observed  data  values.  An  influence  radius 
is  then  specified  and  a  correction  to  the  first-guess  value  at  all  grid  points 
is  computed  as  a  distance-weighted  mean  of  only  the  difference  values  that 
correspond  to  data  points  that  lie  within  the  area  defined  by  the  influence 
radius.  Mathematically,  the  correction  factor  derived  by  Barnes  (1964)  is 
given  by: 


Sti  W,Q, 
Et,  w. 


(22) 


where, 

(i,  j)  -  are  coordinates  of  grid  points; 

ChJ  -  correction  factor  at  the  grid  point  coordinates  (i,  j); 
d  -  the  number  of  data  points  that  fall  within  the  area  around  point  (i,j) 
defined  by  the  influence  radius; 

Qs  -  difference  between  the  observed  mean  and  the  first-guess  at  the  sth  data 
point  in  the  influence  area; 

Ws  =  exp(—Er2/R2)  (for  r  <  R ;  Ws  =  0  for  r  >  R)  -  the  correlation  weight; 
r  -  distance  between  data  and  grid  points; 

R  -  influence  radius;  and, 

E  =  4. 

At  each  grid  point,  the  final  analyzed  gridded  value  Gij  is  the  sum  of  the 
first  guess  Fl:J  and  the  correction  CitJ.  The  expression  is: 


Gij  —  Fij  +  C, 


(23) 
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If  there  is  no  data  within  the  area  defined  by  the  influence  radius,  the  cor¬ 
rection  is  zero  and  the  analyzed  value  is  the  first-guess.  The  analysis  scheme 
is  set  up  such  that  the  inference  radius  can  be  varied  at  each  iteration.  To 
progressively  analyze  the  smaller  scale  phenomena  with  each  iteration,  the 
analysis  begins  with  a  large  inference  radius  which  is  decreased  gradually 
with  each  iteration. 

Equation  23  can  also  be  expressed  in  a  matrix-vector  form, 

G  =  F  +  [dia^(Wed)]_1WQ  (24) 

where  if  n  and  d  denote  the  number  of  model-grid  and  data  points,  respec¬ 
tively,  the  analyzed  held  G  and  the  first  guess  F  are  n-by-1,  the  correlation 
weight  matrix  W  is  n-by-d,  the  difference  Q  between  the  observed  mean  and 
first-guess  at  data  points  is  d-by-1,  and  ed  is  d-by-1  with  unit  entities.  The 
operation  diag(v)  creates  a  diagonal  matrix  i.e.  it  puts  the  vector  v  on  the 
main  diagonal. 

In  analogy  to  the  Kalman  Gain  (K)  from  the  Gauss  Markov  criterion 
(K  =  Cor(x,  X)[C*or(X,  X)  +  R]-1),  Equations  24  and  1  show  that  a  sim¬ 
ilar  Gain  matrix  [KJj  =  [dia<?(Wed)]_1W)  can  be  defined  for  the  Levitus 
methodology.  While  the  multi-scale  OA  approach  in  MSEAS  is  based  on 
Gauss  Markov  estimation  theory  and  successive  scale-by-scale  updates,  the 
Levitus  OA  is  based  on  computing  the  distance-weighted  mean  of  all  dif¬ 
ferences  between  the  most  recent  first-guess  held  and  the  data  mean  within 
the  inference  radius  and  then  repeat  with  a  reduced  inference  radius.  The 
main  difference  is  that  Gauss  Markov  estimation  theory  requires  and  uses 
prior  error  covariances  for  the  data  and  the  hrst-guess,  while  the  Levitus  OA 
requires  radius  of  influence  estimates  and  uses  data  averaging. 

B.  Fast  Marching  Algorithm 

The  fast  marching  algorithm  (Sethian,  1996,  1999b)  is: 

1.  Initialize 

(a)  Alive  points:  Let  A  be  the  set  of  all  grid  points  (i,j)  on  the  starting 
position  of  the  interface  T;  set  Ty  =  0  for  all  points  in  A. 

(b)  Narrow  Band  points:  Let  the  Narrow  Band  be  the  set  of  all  grid 
points  (i,j)  in  the  immediate  neighborhood  of  A;  set  Tl}  =  -d- 
for  all  points  in  the  Narrow  Band  where,  d  is  the  grid  separation 
distance  and  F  is  the  front  speed  (see  Eqn.  13). 
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(c)  Far  Away  points:  Let  the  Far  Away  region  be  the  set  of  all  re¬ 
maining  grid  points  (i.j);  set  =  oo  for  all  points  in  the  Far 
Away  region. 

2.  Marching  Forward 

(a)  Begin  Loop:  Let  ( iminJmin )  be  the  point  in  the  Narrow  Band  with 
the  smallest  value  for  T. 

(b)  Add  the  point  ( iminJmin )  to  A ;  remove  it  from  the  Narrow  Band. 

(c)  Tag  as  neighbors  any  points  (w— (imin+IJmin),  ( iminJmin ~ 
1) ,  ( iminJmin  +  1)  that  are  either  in  the  Narrow  Band  or  the  Far 
Away  region.  If  the  neighbor  is  in  the  Far  Away  region,  remove  it 
from  that  list  and  add  it  to  the  Narrow  Band. 

(d)  Recompute  values  of  T  at  all  neighbors  in  accordance  with  Eqn.  14. 
Select  the  largest  possible  solution  to  the  quadratic  equation. 

(e)  Return  to  the  top  of  the  loop. 

Here  are  some  properties  of  the  fast  marching  algorithm.  The  smallest 
value  in  the  Narrow  Band  is  always  correct.  Other  Narrow  Band  or  Far 
Away  points  with  larger  values  of  T  cannot  affect  the  smallest  value.  Also, 
the  process  of  recomputing  T  values  at  the  neighboring  points  cannot  give  a 
value  smaller  than  any  of  the  accepted  value  at  Alive  points ,  since  the  correct 
solution  is  obtained  by  selecting  the  largest  possible  solution  to  the  quadratic 
equation  (Eqn.  14).  Thus  the  algorithm  marches  forward  by  selecting  the 
minimal  T  value  in  the  Narrow  Band  and  recomputing  the  values  of  T  at  all 
neighbors  in  accordance  with  Eqn.  14. 

The  key  to  an  efficient  version  of  the  algorithm  lies  in  hireling  a  fast  way 
to  locate  the  grid  point  in  the  Narrow  Band  with  the  minimum  value  for 
T.  To  do  so,  the  heapsort  algorithm  (Williams,  1964;  Sedgewick,  1988)  with 
backpointers  is  often  implemented  and  it  is  the  algorithm  we  used  here.  This 
sorting  algorithm  generates  a  “complete  binary  tree”  with  the  property  that 
the  value  at  any  given  parent  node  is  less  than  or  equal  to  the  value  at  its 
child  node.  Heap  is  represented  sequentially  by  storing  a  parent  node  at  the 
location  k  and  its  child  at  locations  2k  and  2k  +  1.  The  member  having  the 
smallest  value  is  stored  at  the  location  k  —  1. 

All  Narrow  Band  points  are  initially  sorted  in  a  heapsort.  The  fast  march¬ 
ing  algorithm  works  by  first  finding,  and  then  removing,  the  member  corre¬ 
sponding  to  the  smallest  T  value  from  the  Narrow  Band  which  is  followed  by 
one  sweep  of  DownHeap  to  ensure  that  the  remaining  elements  satisfy  the 
heap  property.  The  DownHeap  operation  moves  the  element  downwards  in 
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the  heap  until  the  new  heap  satisfies  the  heap  properties.  Far  Away  neigh¬ 
bors  are  added  to  the  heap  using  the  Insert  operation  which  increases  the 
heap  size  by  one  and  brings  the  new  element  to  its  correct  heap  location  using 
the  UpHeap  operation.  The  UpHeap  operation  moves  the  element  upwards 
in  the  heap  until  the  new  heap  satisfies  the  heap  properties.  The  updated 
values  at  the  neighbor  points  obtained  from  Eqn.  14  are  also  brought  to  the 
correct  heap  location  by  performing  the  UpHeap  operation. 


C.  Estimating  the  total  velocity  field  under  geostrophic  balance  by 

minimizing  unknown  inter-island  transports 

For  mesoscale  ocean  flows,  away  from  boundary  layers,  the  dominant 
terms  in  the  horizontal  momentum  equations  are  often  the  Coriolis  force  and 
the  pressure  gradient.  Such  a  flow  field,  where  a  balance  is  struck  between 
the  Coriolis  and  pressure  forces,  is  called  geostrophic.  The  thermal  wind 
equations  are  obtained  for  geostrophic  flows  by  assuming  that  the  vertical 
momentum  equation  is  approximately  given  by  hydrostatic  balance.  The 
thermal  wind  equations  are: 


,d  (pv)  dp  ,  rd  (pu)  dp 
-f~a —  =  9^~  and  f— —  =  g  — 
dz  dx  dz  dy 


(25) 


where,  p  is  the  density,  u  and  v  are  the  horizontal  fluid  velocity  in  the  zonal 
(x)  and  meridional  (y)  directions  respectively,  and  /  =  2 sincp  is  the  Coriolis 
parameter  at  latitude  (j)  for  the  spherical  earth  rotating  at  a  rate  of  H.  The 
thermal  wind  Eqns.  25  when  integrated  in  the  vertical  give: 


pv(x,y,z,t )  =  -j- 
pu(x,y,z,t )  =  j 


dp 

dx 

dp 

dy 


dz  +  pv0 
dz  +  puo 


(26) 


where,  z0  is  a  level  of  reference  where  v0,  u0  are  assumed  known  {zq  is  referred 
to  the  level  of  no  motion  if  u0,  Uq  —  0). 

Flow  estimation  based  on  thermal  wind  balance  (Eqn.  26)  is  a  classical 
problem  in  oceanography  (Wunsch,  1996).  Historically,  the  main  routine 
measurements  were  hydrographic:  temperature,  T,  and  salinity,  S,  at  vari¬ 
ous  depths.  The  equation  of  state  for  seawater  then  permits  the  estimation 
of  density  at  a  given  pressure  from  these  hydrographic  data.  Thus,  with 
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Eqn.  26,  the  vertical  shear  of  the  geostrophic  flow  can  be  computed  from 
hydrographic  data  alone  and  added  to  a  velocity  held  of  reference.  This 
leads  to  mass-conserving  estimates  if  the  reference  velocity  held  is  conser¬ 
vative  since  the  geostrophic  shear  already  satisfies  continuity.  If  reference 
or  external  barotropic  velocities  are  provided  at  open  boundaries,  a  Pois¬ 
son  equation  can  be  formed  for  a  transport  streamfunction  by  taking  the 
curl  of  this  barotropic  velocity.  Solving  for  the  transport  streamfunction  is 
then  straightforward  for  domains  without  any  islands.  For  complex  coastal 
regions  with  islands,  the  same  Poisson  equation  can  be  solved,  imposing  a 
hxed  transport  streamfunction  value  around  each  island.  The  result  con¬ 
serves  mass  by  construction.  Details  are  provided  in  App.  2.2  of  (Haley  and 
Lermusiaux,  2010)  for  both  rigid-lid  and  free-surface  primitive  equations. 

In  the  case  with  islands,  a  hrst-guess  at  the  streamfunction  along  each 
island  coast  can  be  obtained  by  sinking  the  islands  to  a  shallow  depth,  solv¬ 
ing  for  the  corresponding  streamfunction  and  averaging  its  values  along  each 
island  coast.  However,  we  found  that  some  of  the  resulting  inter-island  trans¬ 
ports  can  be  unrealistic,  often  much  too  large.  Hence,  Haley  et  al.  (2011) 
derived  a  methodology  to  correct  for  this.  Specifically,  they  optimize  the 
somewhat  known  inter-island  transports  (i.e.  add  a  least-square  penalty  to¬ 
wards  these  values)  and  minimize  the  unknown  ones.  These  optimized  island 
transport  streamfunctions  are  then  used  as  Dirichlet  boundary  conditions 
in  the  Poisson  equation.  The  result  is  a  mass-conserving  geostrophic  flow 
in  balance  with  the  hydrographic  OA  maps  and  with  optimized  inter-island 
transports.  This  methodology  was  illustrated  in  Sect.  6.2. 

Summarizing  the  inter-island  transport  optimization,  the  objective  is  to 
find  a  set  of  constant  values  for  (T)  along  the  island  coastlines  that  produce  a 
suitably  smooth  initialization  velocity  held,  e.g.  with  no  unrealistically  large 
velocities.  In  the  unknown  straits,  the  goal  is  to  minimize  the  kinetic  energy 
or  the  maximum  absolute  velocity.  The  working  assumptions  are: 

1.  Coastlines  in  the  given  domain  can  be  divided  into  two  distinct  subsets: 

(a) .  Set  A:  N  coastlines  along  which  the  transport  streamfunction  is 
unknown,  N  ^ 

(b) .  Set  B:  M  coastlines  along  which  the  transport  streamfunction  is 
known. 

2.  A  hrst-guess  To  exists  for  the  case  with  coasts  in  set  B,  but  no  coasts 
in  set  A,  i.e.  these  coasts  and  their  corresponding  interiors  are  replaced  by 
open  ocean  (e.g.  island  sunk  to  10m  depth). 

3.  The  difference  between  the  hrst-guess  Tq  and  the  hnal  solution  T  is  not 
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extremely  large.  Otherwise,  the  information  from  'I'q  would  not  be  accurate 
enough. 

'ho  contains  useful  information  such  as  the  position  of  major  currents 
relative  to  various  coastlines  and  the  effects  of  topography  on  the  flow.  Thus, 
To  can  be  used  to  estimate  T  along  the  other  island  coastlines  by  constructing 
an  optimization  functional  for  minimizing  (in  general  optimizing)  the  inter¬ 
island  transports  subject  to  weak  constraints.  The  optimization  functional 
(E)  is  constructed  as  follows.  Its  general  form  is  divided  into  a  summation 
of  three  terms,  given  by: 


E  —  E\  +  E2  +  E3 


(27) 


where,  E\  is  the  minimizing  target  for  the  transport  between  all  pairs  of 
the  unknown  (Set  A)  coasts,  E2  is  the  minimizing  target  for  the  transport 
between  all  pairs  of  unknown  (Set  A)  and  known  (Set  B)  coasts  and  E3 
is  the  minimizing  target  for  the  transport  between  all  pairs  of  the  unknown 
(Set  A)  coasts  and  the  open  boundaries  of  the  domain.  The  minimum  of  E  is 
computed  by  solving  a  standard  least  square  problem,  i.e.  by  setting  gradients 
with  respect  to  the  unknown  T  values  equal  to  zero.  These  streamfunction 
values,  which  smooth  the  velocity  field,  are  then  used  as  Dirichlet  boundary 
conditions  to  the  final  Poisson  equation. 

The  expressions  for  E\ ,  E2  and  E3  are  provided  in  Haley  et  al.  (2011). 
They  require  the  use  of  appropriate  weights:  wnm  for  the  pair  of  islands 
denoted  here  by  subscripts  n  and  m.  These  weights  are  computed  using 
a  FMM  scheme.  Specifically,  consider  the  stream  function  (T)  for  a  two- 
dimensional  horizontal  flow.  It  is  defined  such  that  the  flow  velocity  can  be 
expressed  as: 


u  =  ( u ,  v ) 


x  T/c  u 


1  1  9T 

E[  dy,L  H  dx 


(28) 


Here,  E[  is  the  ocean  depth.  The  transport  between  a  pair  of  islands  having 
streamfunction  ?/q  and  0 2  is  given  by: 


■02  —  fpi  —  /  u.ndA  (29) 

J  A 

where,  A  is  the  vertical  area  between  the  two  islands  and  n  is  the  unit  vector 
normal  to  the  vertical  area.  Equations  28  and  29  suggest  that  the  appropriate 
weight  function  to  optimize  the  velocity  field  should  be  wnrn  =  1/A^m,  where, 
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Anm  is  the  minimum  vertical  area  along  any  path  between  the  two  islands 
n  and  m.  Another  heuristic  choice  of  weight  function  can  be  wnm  =  1  / d^im , 
where  the  dnm’s  are  mean  depths.  We  found  this  choice  only  appropriate 
when  the  depth  is  almost  uniform  in  between  each  pair  of  islands  (■ n,m ).  In 
general,  this  is  not  the  case  and  we  thus  needed  to  compute  the  minimum 
areas  Anm.  Using  the  FMM,  as  described  in  Sect.  4.2,  is  a  very  convenient 
and  efficient  way  to  compute  these  Anm’ s.  Simulations  (Agarwal,  2009)  have 
been  performed  with  several  other  weight  functions  and  they  confirmed  that 
the  choice  of  weights  wnm  =  1  /A'fim  lead  to  the  most  accurate  flow  fields. 
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Figure  1:  Examples  of  optimal  shortest  sea  paths  computed  using  the  Level 
Set  Method  in:  (Top  -  Left)  Monterey  Bay;  (Top  -  Right)  Massachusetts 
Bay;  (Bottom  -  Left)  Dabob  Bay;  (Bottom  -  Right)  Philippines  Archipelago. 
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Figure  2:  Temperature  (°C)  (Top  -  Left)  and  Salinity  (PSU)  (Top-Right) 
data  in  Dabob  Bay.  OA  fields  for  this  Temperature  (°C)  (Left)  and  Salinity 
(PSU)  (Right)  in  Dabob  Bay  from  the  optimal  path  length  computed  using: 
(Middle)  Bresenham-based  line  algorithm;  (Bottom)  Fast  Marching  Method, 
clearly  showing  the  issues  of  the  Bre.4§iham- based  line  algorithm. 


Figure  3:  (Top  -  Left)  World  Ocean  Atlas  2005  Climatology  in  situ  tem¬ 
perature  (°C)  at  0m.  Temperature  (°C)  OA  Fields  obtained  using:  (Top 
-  Right)  Standard  OA  without  taking  islands  into  account;  (Bottom  -  Left) 
Fast  Marching  Method;  (Bottom  -  Right)  SPDE  approach  (representing  field 
by  a  stochastically  forced  Helmholtz  Equation). 
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Figure  4:  (Top  -  Left)  World  Ocean  Atlas  2005  Climatology  in  situ  temper¬ 
ature  (°C)  at  200.0m.  Temperature  (°C)  OA  Fields  obtained  using:  (Top 
-  Right)  Standard  OA  without  taking  islands  into  account;  (Bottom  -  Left) 
Fast  Marching  Method;  (Bottom  -  Right)  SPDE  approach  (representing  held 
by  a  stochastically  forced  Helmholtz  Equation). 
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Figure  5:  (Top  -  Left)  World  Ocean  Atlas  2005  Climatology  in  situ  temper¬ 
ature  (°C)  at  1000.0m.  Temperature  (°C)  OA  Fields  obtained  using:  (Top 
-  Right)  Standard  OA  without  taking  islands  into  account;  (Bottom  -  Left) 
Fast  Marching  Method;  (Bottom  -  Right)  SPDE  approach  (representing  held 
by  a  stochastically  forced  Helmholtz  Equation). 
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Figure  6:  (Top  -  Left)  World  Ocean  Atlas  2005  Climatology  in  situ  Salinity 
(PSU)  at  0m.  Salinity  (PSU)  OA  Fields  obtained  using:  (Top  -  Right)  Stan¬ 
dard  OA  without  taking  islands  into  account;  (Bottom  -  Left)  Fast  Marching 
Method;  (Bottom  -  Right)  SPDE  approach  (representing  field  by  a  stochas¬ 
tically  forced  Helmholtz  Equation). 
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Figure  7:  (Top  -  Left)  World  Ocean  Atlas  2005  Climatology  in  situ  Salin¬ 
ity  (PSU)  at  200.0m.  Salinity  (PSU)  OA  Fields  obtained  using:  (Top  - 
Right)  Standard  OA  without  taking  islands  into  account;  (Bottom  -  Left) 
Fast  Marching  Method;  (Bottom  -  Right)  SPDE  approach  (representing  held 
by  a  stochastically  forced  Helmholtz  Equation). 
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Figure  8:  (Top  -  Left)  World  Ocean  Atlas  2005  Climatology  in  situ  Salin¬ 
ity  (PSU)  at  1000.0m.  Salinity  (PSU)  OA  Fields  obtained  using:  (Top  - 
Right)  Standard  OA  without  taking  islands  into  account;  (Bottom  -  Left) 
Fast  Marching  Method;  (Bottom  -  Right)  SPDE  approach  (representing  held 
by  a  stochastically  forced  Helmholtz  Equation). 
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Figure  9:  Difference  between  Temperature  (°C)  field  at  Level  =  1000m  ob¬ 
tained  using  Fast  Marching  Method  and  using:  (Top  -  Left)  Standard  OA; 
(Top  -  Right)  SPDE  (representing  field  by  Helmholtz  equation).  Difference 
between  Salinity  (PSU)  field  at  Level  =  1000m  obtained  using  Fast  Marching 
Method  and  using:  (Bottom  -  Left)  Standard  OA;  (Bottom  -  Right)  SPDE 
(representing  field  by  Helmholtz  equation). 
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Figure  10:  Velocity  estimation  under  geostrophic  balance  and  optimized 
inter-island  transports  (weight  functions  based  on  the  minimum  vertical  area) 
from  hydrographic  held  maps  (WOA05)  obtained  using  the  FMM  (Left) 
and  using  the  SPDE  Approach  (Right):  (Top)  Streamfunction,  Velocity  at 
depths:  (Middle)  0m;  (Bottom)  100m. 
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Figure  11:  Locations  of:  (Top)  Melville  exploratory  cruise  and  glider  data 
(Summer  2007)  and  (Bottom)  Melville  joint  cruise  Data  (Winter  2008),  both 
in  the  Philippines  Archipelago. 
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Summer  2007 


Figure  12:  Temperature  (°C)  OA  Fields  at  Orn  (Left)  and  200m  (Right) 
using  the:  (Top)  Melville  exploratory  cruise  and  glider  data  (Summer  2007); 
(Bottom)  Melville  joint  cruise  data  (Winter  2008).  Colorbars  are  not  the 
same  for  the  two  periods  due  to  the  winter  and  summer  variability. 
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Figure  13:  Salinity  (PSU)  OA  Fields  at  Om  (Left)  and  200m  (Right)  using 
the:  (Top)  Melville  exploratory  cruise  and  glider  data  (Summer  2007);  (Bot¬ 
tom)  Melville  joint  cruise  data  (Winter  2008).  Colorbars  are  not  the  same 
for  the  two  periods  due  to  the  winter  and  summer  variability. 
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Figure  14:  Chlorophyll  (//mol /Kg)  OA  Fields  using  the  FMM  at  Level:  (Top 
-  Left)  Om;  (Top  -  Right)  10m;  (Bottom  -  Left)  50m;  (Bottom  -  Right)  160m. 
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Figure  15:  (Top  -  Left)  World  Ocean  Atlas  2005  (Spliced  February  and  Win¬ 
ter  Climatology)  in  situ  temperature  (°C)  at  0.0m;  Temperature  (°C)  OA 
Fields  using  the  Fast  Marching  Method  at  the  surface  (0m)  using  the  follow¬ 
ing  scheme  and  scales:  (Top  -  Right)  First  order  FMM  and  Lq  =  540km, 
Le  =  180km;  (Bottom  -  Left)  First  order  FMM  and  L0  =  1080km, 
Le  =  360km;  (Bottom  -  Right)  ffigher  order  FMM  and  L0  =  1080km, 
Le  =  360km. 
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Figure  16:  Example  of  an  idealized  (multiply-connected)  domain  having  an 
island. 
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Figure  17:  Temperature  (°C)  OA  Fields  at  the  surface  (Om)  (scales  L0  = 
1080km,  Le  =  360km)  using  the:  (Top  -  Left)  FMM;  (Top  -  Right)  FMM 
and  removal  of  problematic  data;  (Bottom  -  Left)  FMM  and  introducing 
process  noise;  (Bottom  -  Right)  FMM  and  applying  dominant  singular  value 
decomposition  (SVD)  of  a-priori  covariance. 
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